You will:
- Define and drive the technical strategy for model distillation and compression across Waabi's AI stack (perception, world models, and planning), targeting both onboard deployment and simulation use cases.
- Design, implement, and scale state-of-the-art distillation and efficiency pipelines, which may include:
  - Distillation for generative models (diffusion, autoregressive, flow-matching, video models)
  - Quantization-aware training (QAT) and post-training quantization (PTQ)
  - Knowledge distillation (feature-level, response-based, and relation-based)
  - Structured and unstructured pruning and sparsification
  - Low-rank factorization and efficient architecture design
  - Speculative decoding and other inference-time efficiency techniques
- Collaborate closely with ML Platform, Infrastructure, Onboard, Autonomy, and Simulation teams to integrate compressed models into production pipelines and meet latency, memory, and throughput targets across deployment contexts.
- Define rigorous benchmarks and evaluation frameworks to characterize efficiency vs. quality trade-offs across models and hardware targets.
- Mentor and guide researchers and engineers working in the distillation and model efficiency space, setting a high technical bar and fostering a culture of rigorous experimentation.
- Champion best practices for model compression across the organization; disseminate knowledge through internal design reviews, documentation, and technical talks.
- Stay at the cutting edge of model efficiency research; contribute to the broader scientific community through publications and open-source contributions.
Qualifications:
- Deep distillation expertise: You have extensive hands-on experience designing and implementing distillation, quantization, pruning, and model compression techniques for large-scale neural networks, with demonstrated impact in production settings.
- Strong research and engineering foundation: You have a Bachelor's or Master's degree in Machine Learning, Computer Vision, Robotics, or a related field, or equivalent industry experience; relevant hands-on experience in model distillation and efficiency is what matters most. You have expert Python and PyTorch (or JAX) skills and experience with large-scale distributed training.
- Technical leadership: You have a proven track record of setting technical direction and driving projects from conception to production. You inspire and elevate those around you through deep technical expertise and mentorship.
- Cross-functional collaboration: You have experience working closely with infrastructure, platform, and autonomy teams to deploy compressed models under real engineering constraints.
- Clear communicator: You can communicate complex technical trade-offs clearly to diverse audiences and drive alignment across research and engineering teams.
Bonus:
- Experience with hardware-aware optimization (TensorRT, ONNX, custom CUDA kernels, hardware-specific quantization).
- Publications at top-tier ML/CV venues (NeurIPS, ICML, CVPR, ICLR, ECCV) in model compression, efficient deep learning, or related areas.
- Experience distilling large generative models (diffusion models, LLMs, VLMs, or video models).
- Background in autonomous vehicles or robotics.
Location: Waabi office, San Francisco, California, United States


