Archetype AI

Staff Backend Software Engineer: Inference

Sorry, this job was removed at 02:09 a.m. (PST) on Saturday, May 09, 2026
In-Office or Remote
Hiring Remotely in San Mateo, CA, USA

About Archetype AI

Archetype AI is developing the world's first AI platform to bring AI into the real world. Formed by an exceptionally high-caliber team from Google, Archetype AI is building a foundation model for the physical world, a real-time multimodal LLM for real life, transforming real-world data into valuable insights and knowledge that people will be able to interact with naturally. It will help people in their real lives, not just online, because it understands the real-time physical environment and everything that happens in it.

Supported by deep tech venture funds in Silicon Valley, Archetype AI is currently at the Series A stage and is progressing rapidly to develop technology for their next stage. This presents a unique and once-in-a-lifetime opportunity to be part of an exciting AI team at the beginning of their journey, located in the heart of Silicon Valley.

Our team is headquartered in San Mateo, California, with team members throughout the US and Europe.

We are actively growing, so if you are an exceptional candidate excited to work on the cutting edge of physical AI and don’t see a role below that exactly fits you, you can contact us directly with your resume at jobs@archetypeai.io.

About the Job

We’re looking for a highly motivated backend engineer with extensive experience in designing and developing performant, scalable, and resilient inference services.

You’ll work closely with researchers, ML engineers, and product teams to bring cutting-edge AI capabilities into production—at scale, with reliability, and under real-world constraints.

This is an opportunity to own key services across our inference platform, from intelligent request routing to fleet-wide orchestration across diverse AI accelerators, and to contribute to some of the most advanced real-time AI serving systems in production today.

Core Responsibilities
  • Architect, implement, and maintain distributed inference serving systems that support high-throughput, low-latency model serving across multiple AI accelerator families and cloud platforms.

  • Enable breakthrough research by providing scientists with high-performance inference infrastructure to develop next-generation models.

  • Continuously optimize inference performance—including batching, caching, and request routing strategies—to maximize compute efficiency under explosive customer growth.

  • Build tooling and observability to monitor system health, identify bottlenecks, and proactively resolve instability.

  • Introduce new techniques, architectures, and best practices to push the limits of scalability, efficiency, and reliability.

  • Own problems end-to-end—from design to deployment—with a strong bias toward quality, automation, and continuous improvement.

  • Balance rapid iteration on early-stage systems with long-term maintainability and architectural soundness.

  • Contribute to a culture of engineering excellence, mentorship, and team-first collaboration.

Minimum Qualifications
  • 7+ years of professional software engineering experience, with a focus on inference.

  • Deep understanding of machine learning systems at scale, including load balancing, request routing, or traffic management.

  • Experience with inference optimization, batching, and caching strategies.

  • Ability to design APIs and service interfaces for real-time and latency-sensitive use cases.

  • Experience building and operating production-grade systems at scale in cloud environments (e.g., Azure, AWS, GCP).

  • Strong debugging, instrumentation, and observability skills across distributed systems.

  • Demonstrated ownership of complex technical problems and ability to learn and adapt quickly.

Preferred Qualifications
  • Proven track record of scaling systems through rapid growth and rebuilding or refactoring for new demands.

  • Experience building systems that degrade gracefully under load: backpressure, rate limiting, circuit breaking, bulkheading, and queuing.

  • Strong understanding of failure modes in distributed systems and mitigation techniques.

  • Proven experience owning high-availability services (e.g., SLOs, incident response, on-call), including capacity planning and load testing.

  • Proficiency in multiple programming languages (e.g., Rust, C++, Python).

  • Experience designing internal tools or platforms to support developer productivity and experimentation.

  • Strong product intuition and the ability to collaborate closely with cross-functional teams, including research and design.

What We Value
  • Ownership – You take initiative, follow through, and care deeply about quality and outcomes.

  • Motivation – You’re driven to solve complex problems and continuously raise the bar for yourself and your team.

  • Excellence – You bring discipline, clarity, and rigor to your craft—and help others do the same.

  • Collaboration – You work well with others, mentor generously, and contribute to a high-trust, high-performance culture.
