Sauron

AI Inference Engineer

Reposted Yesterday
In-Office
San Francisco, CA, USA
175K-225K Annually
Mid level
The AI Inference Engineer will optimize and productionize AI models for real-time applications on edge devices, enhancing robotic systems' performance.
The summary above was generated by AI
Who We Are
Sauron protects your family and home, bringing the innovations of autonomous robots and self-driving cars to residential security. Our team is led by veteran operators and engineers, alumni of Sonos, PayPal, Tesla, Apple, and Google. Sauron has raised a $27M seed round led by A* and Atomic with participation from other leading venture capital firms.

The Role 
We’re looking for an AI Inference Engineer who lives at the boundary of high-performance software and physical hardware. In this role, you won’t just be managing pipelines; you’ll be squeezing every drop of performance out of silicon to ensure our perception systems can see, think, and act in real time.
You will own the productionizing of AI: taking sophisticated models and transforming them into lightning-fast, production-ready engines running on edge devices in homes across the country. If you are obsessed with CUDA kernels, TensorRT optimizations, and the challenge of deploying robust vision systems on real robots, we want to talk to you.

What You’ll Do
  • Lead the development and optimization of low-latency inference engines using TensorRT and ONNX, including authoring custom plugins to support cutting-edge architectures.
  • Design and maintain multithreaded video processing and streaming pipelines (RTSP, RTP, HLS) using GStreamer and DeepStream.
  • Collaborate closely with embedded engineers to integrate perception software with Yocto platforms, ensuring seamless hardware-software synergy.
  • Work with raw data from cameras and LiDAR to enable real-time data capture, obstacle detection, and avoidance.
  • Write and optimize custom CUDA kernels and perform low-level GPU tuning to maximize throughput and minimize power consumption.
  • Productionize proven prototypes from JetPack into Yocto.
  • Apply advanced optimization techniques, including quantization (INT8/FP16), pruning, and distillation, to bring research-grade models to production-grade efficiency.

What You Bring
  • Bachelor’s or Master’s degree in Computer Science, Electrical Engineering, Robotics, or a related field.
  • 3+ years of experience developing and deploying computer vision or machine learning applications on real-world robotic systems (not just in simulation).
  • High proficiency in C, C++, and Python, with a focus on real-time and embedded systems.
  • Expert-level knowledge of the NVIDIA Jetson ecosystem (JetPack SDK, DeepStream, TensorRT) and a deep understanding of CUDA/GPU architecture.
  • Hands-on experience with video streaming tools like FFmpeg and protocols such as RTSP, RTP, and HLS.
  • Proven track record of deploying AI systems that operate in the field, handling the unpredictability of real-world sensor data.

Nice to Have
  • Familiarity with NVIDIA’s broader robotics stack.
  • Experience with ML compilers or compiler-level optimizations for GPU inference.
  • Specific background in sensor fusion and AI-driven obstacle avoidance for autonomous navigation.
  • Exposure to remote logging, log ingestion, and distributed telemetry aggregation.
  • Previous experience in early-stage startups or fast-paced hardware/software integration environments.

We Value
1. The Power of "We": “Align, then Accelerate”
  • We celebrate as a team and troubleshoot as a team.
  • The goal is the mission, not the credit.
2. High Challenge, Low Ego: "Respect the person, debate the idea."
  • Be ruthless with problems, but kind to people.
  • Raise the bar, lower the shield.
3. Speak up: "Silence is a setback."  
  • Your perspective is a requirement, not a suggestion.
  • Speak the hard truths early so we can fix them fast.
4. Integrity in Motion: "Own the outcome, not just the task."
  • Do what you say you’ll do.
  • If it breaks, fix it. If it works, make it better.
5. Humanity at the Core: "Relationships over transactions."
  • Earn trust through empathy and consistency.
  • Anticipate needs before they become requests.

The compensation range for this position is $175K-$225K base, plus equity and benefits.

Why Sauron
You’ll be joining a deeply technical team obsessed with building real-world systems that make a tangible difference in people’s lives. We move quickly, iterate relentlessly, and ship with urgency, all while holding a deep respect for software craftsmanship and system reliability. If you’re looking to solve challenging problems and own major parts of the deployment stack for a category-defining product, we want to talk.

We are focused on building a diverse and inclusive workforce. If you’re excited about this role, but do not meet 100% of the qualifications listed above, we encourage you to apply.
-----
Sauron is an Equal Opportunity Employer and considers applicants for employment without regard to race, color, religion, sex, orientation, national origin, age, disability, genetics or any other basis forbidden under federal, state, or local law.
Please review our CCPA policies here.