Luminary Cloud

SRE

Reposted 23 Days Ago
In-Office
San Mateo, CA, USA
Senior level
Design, implement, and maintain scalable backend systems and APIs; build cloud infrastructure (preferably GCP) using Terraform; operate containerized workloads with Kubernetes; ensure reliability, security, and performance; participate in on-call rotations, architecture discussions, and cross-functional delivery.
The summary above was generated by AI

About Luminary

Luminary helps engineering companies be more competitive by getting to market faster, creating new, better products, and reducing development risk. We do this with our Physics AI platform, the fastest and easiest way to build and deploy models to understand and instantly predict physical reality with precision. Customers span industries from automotive and aerospace to leading sporting equipment providers, including Otto Aviation, Joby Aviation, Piper Aircraft, and Trek Bikes. Luminary is a Series B company and is headquartered in San Mateo, California.


Role Description


The Luminary Physics AI platform is a SaaS offering that runs on GCP. It uses GPUs for data generation, model training, and model inference, and supports accelerated engineering design workflows. The product generates and consumes large volumes of data for Physics AI models and is used by some of the most demanding customers in the automotive, aerospace, and defense industries. An elevated security and compliance posture, the need to maintain five-nines SLAs, automation of most tasks, and the management of large data volumes make this an exciting opportunity for a production Site Reliability Engineer.


The right candidate will apply software engineering principles to operations, focusing on system reliability, performance, and scalability. You will collaborate closely with engineering and product teams to design, deliver, and scale the core systems that power our platform. You will be responsible for suggesting product changes that allow us to support 10k simultaneous users on the platform with effective resource management.


Key Duties, Responsibilities, and Deliverables

Reliability & Operational Excellence
  • Participate in on-call rotations and incident response, implementing effective remediation strategies and leading post-incident reviews to prevent recurrence
  • Define, monitor, and enforce Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to meet internal and external reliability targets
  • Apply software engineering practices to eliminate toil by automating operational tasks, improving overall efficiency, and contributing to the operational reliability of the platform
Infrastructure & Platform Engineering
  • Develop and enhance our cloud infrastructure (GCP preferred) through automation and Infrastructure as Code (Terraform)
  • Develop, oversee, and maintain operational systems (from deployment pipelines to orchestration layers) ensuring application health, reliability, and scalability using containerized solutions like Kubernetes
  • Execute scalability and performance optimization strategies to ensure systems efficiently handle increasing workloads and future growth
Architecture & Systems Design
  • Contribute to the design and implementation of highly available and fault-tolerant systems, leveraging Service-Oriented Architecture (SOA) or microservices principles
  • Participate in architectural discussions that influence the platform’s long-term reliability, performance, and scalability
Security & Documentation
  • Collaborate with security experts to integrate IAM, authentication, authorization, encryption, and related best practices into the infrastructure
  • Create and maintain comprehensive documentation on system architecture, infrastructure, and security practices
Expertise and Qualifications

Required
  • Proven experience designing and implementing scalable SaaS backend systems.
  • Strong understanding of cloud infrastructure (GCP preferred), CI/CD pipelines, and core SRE/DevOps concepts.
  • 5+ years of experience building performant, scalable, distributed systems (or equivalent experience).
  • 10+ years of experience required for Senior/Lead candidates.
  • Proficiency in Golang and Python is highly desirable.
  • Familiarity with Kubernetes and container orchestration.
  • Experience with Infrastructure as Code (Terraform) and cloud automation.
  • Strong understanding of operational practices and willingness to participate in on-call rotations.
  • Knowledge of modern security principles and IAM fundamentals.
Preferred Qualifications (Senior Candidates)
  • Demonstrated success scaling infrastructure in a startup environment, including multicloud, hybrid, or on-prem deployments.
  • Proven experience mentoring and guiding engineers, supporting technical growth and career development.
  • Ability to act as a technical architect, making high-impact design decisions for reliable, scalable, and secure platform systems.

Top Skills

CI/CD
Cloud Automation
Container Orchestration
Go
Google Cloud Platform
IAM
Infrastructure as Code
Kubernetes
Microservices
Python
Service-Oriented Architecture
Terraform
HQ

Luminary Cloud Redwood City, California, USA Office

500 Arguello St, Suite 105, Redwood City, California, United States, 94063

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms, and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology, thanks to a thriving art and music scene, excellent food, and a short drive to several of the country’s most beautiful recreational areas.

Key Facts About San Francisco Tech

  • Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
  • Major Tech Employers: Google, Apple, Salesforce, Meta
  • Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
  • Funding Landscape: $50.5 billion in venture capital funding in 2024 (Pitchbook)
  • Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
  • Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
