Job Title : Vision/SLAM Engineer — Embodied Robotics


Company : ndimensions labs


Location : Toronto, Ontario


Created : 2025-12-15


Job Type : Full Time


Job Description

About Us

We're a team of technologists from MIT, the University of Waterloo, and the University of Washington who know how to turn deep tech into products, with multiple successful ventures behind us.

About the Role

We're looking for a Vision/SLAM-focused Robotics Software Engineer to push the boundaries of robot perception and mapping, combining state-of-the-art visual learning with proven SLAM and sensor fusion.

You will architect and ship the perception/mapping stack that powers intelligent behavior: visual/VIO pipelines with robust tracking, loop closure, and relocalization; semantic, multi-layer maps that planners and policies can act on; and world models that bridge simulation and real deployment. Expect to fuse RGB/RGB-D/LiDAR/IMU under strict latency/compute budgets, adapt modern ViTs/VLMs for 2D/3D understanding, and deliver mapping-aware control that closes the loop from pixels to policies.

What You'll Do

- Own the mapping stack: design and ship visual SLAM pipelines (front-end + back-end) with robust tracking, loop closure, and relocalization under tight latency/compute budgets.
- Build semantic maps: fuse geometry with semantics into multi-layer maps usable by planners and policies.
- World models: develop learned predictive/latent-state models that capture scene dynamics and uncertainty; integrate them with control and task policies.
- Multi-sensor fusion: calibrate and fuse RGB/RGB-D/LiDAR/IMU/wheel odometry; handle time sync, extrinsics, and degraded sensing.
- Representation learning: adapt ViTs/VLMs for segmentation, detection, tracking, place recognition, and 3D understanding; learn scene graphs and object-centric representations.
- Advance the stack: explore beyond current VLAs (OpenVLA/RT-2/RT-X), adapt ViTs (DINO, SAM), VLMs (CLIP, BLIP-2, LLaVA), and diffusion planners (UniPi, Diffusion Policy) for mapping-aware control.

What We're Looking For

- SLAM expertise: visual/VIO/VSLAM experience (feature- or direct-based), bundle adjustment, factor graphs, pose-graph optimization, loop closure, place recognition, robust estimation (see the short pose-graph sketch at the end of this posting).
- Semantic mapping: panoptic/instance segmentation, 2D-to-3D lifting, multi-layer map fusion, uncertainty modeling, lifelong/incremental mapping.
- World modeling: learned state-space models, dynamics prediction.
- Strong CV & multimodal background: transformer-based models, self-supervised learning, tracking, foundation-model adaptation for robotics.
- Engineering: C++ and Python; CUDA/TensorRT a plus; ROS 2; strong profiling/latency discipline; productionizing perception systems on robots.
- Data: curation/augmentation for robotics; evaluation protocols.
- Sim + real: Isaac/MuJoCo/Habitat and on-robot bring-up; optimization libraries (Ceres, GTSAM); geometric libraries (OpenCV, Open3D).

Bonus

- Differentiable SLAM or neural fields (NeRF/3DGS) integrated with classical stacks.
- Active perception, task-driven exploration, or belief-space planning.
- Publications at top venues (CVPR/ICCV/ECCV/CoRL/RSS/ICRA/IROS).
- Experience with large-scale multi-robot mapping or map compression/streaming.

If you'd like to share more about your work, such as papers, repos, or demos, feel free to send your CV and links. We are an equal opportunity employer and welcome applicants from all backgrounds.
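For a flavor of the pose-graph optimization and loop-closure work described above, here is a minimal sketch using the Python bindings of GTSAM, one of the optimization libraries named in this posting. The specific poses, noise sigmas, and key numbering are illustrative assumptions, not part of the role.

```python
# Minimal 2D pose-graph sketch with GTSAM's Python bindings (pip install gtsam).
# Keys 1-3 are robot poses; the final factor is a loop closure back to pose 1.
# All numbers here are made up for illustration.
import numpy as np
import gtsam

graph = gtsam.NonlinearFactorGraph()
prior_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.1, 0.1, 0.05]))
odom_noise = gtsam.noiseModel.Diagonal.Sigmas(np.array([0.2, 0.2, 0.1]))

# Anchor the first pose at the origin.
graph.add(gtsam.PriorFactorPose2(1, gtsam.Pose2(0.0, 0.0, 0.0), prior_noise))

# Odometry: drive 2 m forward, then 2 m forward while turning 90 degrees.
graph.add(gtsam.BetweenFactorPose2(1, 2, gtsam.Pose2(2.0, 0.0, 0.0), odom_noise))
graph.add(gtsam.BetweenFactorPose2(2, 3, gtsam.Pose2(2.0, 0.0, np.pi / 2), odom_noise))

# Loop closure: a place-recognition match constrains pose 3 relative to pose 1.
graph.add(gtsam.BetweenFactorPose2(3, 1, gtsam.Pose2(0.0, 4.0, -np.pi / 2), odom_noise))

# Deliberately drifted initial estimates, as dead reckoning would produce.
initial = gtsam.Values()
initial.insert(1, gtsam.Pose2(0.1, -0.1, 0.05))
initial.insert(2, gtsam.Pose2(2.2, 0.1, -0.05))
initial.insert(3, gtsam.Pose2(4.1, 0.2, np.pi / 2 + 0.1))

# Levenberg-Marquardt pulls the trajectory back into global consistency.
result = gtsam.LevenbergMarquardtOptimizer(graph, initial).optimize()
for k in (1, 2, 3):
    print(k, result.atPose2(k))
```

The loop-closure factor is what distinguishes this from pure odometry integration: without it the drift in the initial estimates would persist, while with it the optimizer redistributes the error across the whole trajectory.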