Archetype AI

Site Reliability Engineer

Posted 2 Days Ago
In-Office
San Mateo, CA, USA
Senior level
The Site Reliability Engineer will own backend services and cloud infrastructure, focusing on system reliability, scalability, and operational excellence for Archetype AI's platform.
The summary above was generated by AI
About Archetype AI

Archetype AI is developing the world's first AI platform to bring AI into the real world. Formed by an exceptionally high-caliber team from Google, Archetype AI is building a foundation model for the physical world, a real-time multimodal LLM for real life, transforming real-world data into valuable insights and knowledge that people will be able to interact with naturally. It will help people in their real lives, not just online, because it understands the real-time physical environment and everything that happens in it.

Supported by deep tech venture funds in Silicon Valley, Archetype AI is currently at the Series A stage and is progressing rapidly to develop technology for its next stage. This presents a once-in-a-lifetime opportunity to be part of an exciting AI team at the beginning of its journey, located in the heart of Silicon Valley.

Our team is headquartered in San Mateo, California, with team members throughout the US and Europe.

We are actively growing, so if you are an exceptional candidate excited to work on the cutting edge of physical AI and don’t see a role below that exactly fits you, you can contact us directly with your resume at jobs@archetypeai.io.

About the Job

This role will own the backend services and cloud infrastructure that power Archetype AI’s production platform—driving system reliability, scalability, and operational excellence as the company scales to meet growing customer and research demands. The engineer will work across the full stack of distributed systems and cloud platform concerns, from designing high-throughput services to provisioning and automating the infrastructure they run on.

Core Responsibilities:
  • Architect, implement, and maintain distributed systems that support high-throughput, low-latency AI model inference and data services.

  • Design, provision, and manage cloud infrastructure (AWS, GCP, and/or Azure) including compute, networking, storage, and IAM—using infrastructure-as-code tools such as Terraform, Pulumi, or CloudFormation.

  • Build and operate Kubernetes-based platforms for deploying and scaling production workloads, including GPU-accelerated inference services.

Minimum Qualifications
  • 7+ years of professional software engineering experience, with a focus on backend or distributed systems.

  • Deep understanding of distributed systems fundamentals—concurrency, consistency, replication, fault tolerance, networking.

  • Hands-on experience building and operating production infrastructure in cloud environments (AWS, GCP, and/or Azure), including compute, networking, and storage services.

  • Working knowledge of container orchestration (Kubernetes) and infrastructure-as-code (Terraform, Pulumi, or similar).

  • Strong debugging, instrumentation, and observability skills across distributed systems and cloud infrastructure.

  • Demonstrated ownership of complex technical problems and ability to learn and adapt quickly.

Preferred / Nice-to-Have Skills
  • Proven track record of scaling systems through rapid growth and rebuilding or refactoring for new demands.

  • Experience designing and operating multi-region or multi-cloud deployments with high availability and disaster recovery.

  • Proficiency in systems programming languages (e.g., Rust, C++) and scripting environments (e.g., Python).

  • Experience with Kubernetes ecosystem tooling—Karpenter, Kueue, Helm, ArgoCD, or similar—for workload scheduling, autoscaling, and GitOps.

  • Familiarity with CI/CD systems, service mesh architectures, and secrets/config management at scale.

  • Experience with FIPS compliance, container hardening, or government cloud environments (C2S/SC2S, GovCloud).

  • Familiarity with modern ML stacks and hardware acceleration (e.g., PyTorch, CUDA).

Top Skills

AWS
Azure
C++
CloudFormation
CUDA
GCP
Kubernetes
Pulumi
Python
PyTorch
Rust
Terraform


