The Scene Understanding Semantic Reasoning team at Zoox builds the high-performance reasoning engines that allow our autonomous vehicles to navigate complex driving environments and high-speed roads. We translate sensor data and detected objects into deep semantic understanding, ensuring our robots make human-level decisions in real time.
We are seeking experienced engineers passionate about the intersection of robotics and cutting-edge AI. In this role, you will work on critical initiatives alongside partner Perception and Motion Planning teams, developing production-grade multi-task transformers and integrating Vision Language Action (VLA) model outputs to build comprehensive spatial representations for our fleet. You will tackle the inherent unpredictability of urban driving on highways and freeways to improve range and accuracy, ensuring our vehicles remain safe and resilient at all times.
In this role, you will...
Model Training & Deployment: Design, train, and deploy deep learning models for semantic reasoning, specifically tailored to achieve the extended spatial range and high fidelity required for high-speed highway environments.
Cross-Functional Collaboration: Collaborate with the Scene Intelligence, Semantic Grounding, and PCP Mapping teams to adapt and elevate the unified machine learning stack for highway scenarios.
Requirements & Validation: Partner with downstream motion planning teams to define semantic representation requirements, establish robust validation workflows, and ensure model outputs meet strict safety and clearance metrics.
Optimization: Optimize deep learning models for real-time inference efficiency, ensuring low-latency execution within the rigorous compute constraints of the Zoox vehicle platform.
Edge Case Resolution: Investigate and resolve perception-related regressions and edge cases found in high-speed driving simulations and live fleet data.
Strategic Architecture: Contribute to the long-term "North Star" architecture for Perception Semantic Reasoning, paving the way for scalable fleet deployment across new vehicle platforms.
Qualifications
MS with 3–5 years, or PhD with 0–2 years, of professional software engineering experience, in Computer Science, Robotics, Electrical Engineering, or a related field. Experience in autonomous driving, robotics, or computer vision is ideal.
Deep understanding of 2D/3D computer vision, semantic segmentation, and deep learning architectures.
Exceptional programming skills in modern C++ and Python.
Hands-on experience with modern deep learning frameworks like JAX or PyTorch.
Proven track record of deploying real-time machine learning models on resource-constrained embedded systems or on-bot hardware.
Bonus Qualifications
Prior experience with highway autonomous driving scenarios and their specific mapping and perception challenges.
Familiarity with state-of-the-art BEV and sparse transformer architectures, and with Vision-Language Models (VLMs).
Strong publication record in top AI conferences or journals (e.g., CVPR, ICCV, ECCV, ICML, NeurIPS).
Zoox Foster City, California, USA Office
4000 E 3rd Ave, Foster City, CA, United States, 94404
Zoox Foster City, California, USA Office
1149 Chess Drive, Foster City, CA, United States, 94404
Zoox Fremont, California, USA Office
47540 Kato Road, Fremont, CA, United States, 94538
Zoox San Francisco, California, USA Office
60 Broadway St, San Francisco, CA, United States, 94111


