Top Reliability Engineer Jobs in San Francisco, CA

Reposted 18 Days Ago
Easy Apply
Remote
San Francisco Bay Area, CA
130K-140K Annually
Senior level
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
The Senior Site Reliability Engineer will manage system incidents, enhance monitoring and database infrastructure, and collaborate on scalable systems to maintain reliability as usage scales.
Top Skills: AWS, Clickhouse, Kubernetes, MySQL, Postgres, Redis
Reposted 13 Hours Ago
Easy Apply
In-Office
San Francisco Bay Area, CA
180K-250K Annually
Senior level
Cloud • Digital Media • Information Technology
Operate and improve Kubernetes-based production systems, manage cluster lifecycle and networking, build CI/CD and GitOps pipelines, define SLOs and incident response, automate resolution with AI, implement monitoring/alerting, and drive reliability through automation and chaos engineering.
Top Skills: Ansible, Argocd, Bash, Bgp, Calico, Ceph, Cilium, Cni Plugins, Coroot, Datadog, Dns, Ebpf, Falco, Fluxcd, Go, Grafana, Kubernetes, Loki, Longhorn, Metallb, Prometheus, Python, SIEM, Terraform, Thanos, Victoriametrics, Vxlan, Xdp
Reposted 13 Hours Ago
In-Office
San Francisco Bay Area, CA
181K-237K Annually
Senior level
Healthtech • Insurance
The Senior Software Engineer will lead complex projects, mentor engineers, and ensure cloud infrastructure is resilient and automated. Responsibilities include developing software, managing production environments, and enforcing coding standards.
Top Skills: Argocd, AWS, GCP, Github Actions, Grafana, Istio, Kubernetes, Prometheus, Terraform
Yesterday
In-Office
San Francisco Bay Area, CA
230K-390K Annually
Senior level
Artificial Intelligence • Software
As a Software Engineer on the Site Reliability team, you'll ensure system reliability, scalability, and observability while partnering with engineering teams and improving incident management processes.
Top Skills: AWS, Ci/Cd Tooling, Container Orchestration, Datadog, Grafana, Prometheus, Terraform
Reposted 14 Days Ago
Remote
San Francisco Bay Area, CA
146K-162K Annually
Senior level
Healthtech • Software
The Database Reliability Engineer manages and maintains cloud-based database infrastructures for SaaS applications, focusing on automation, process improvement, and collaboration with engineering teams.
Top Skills: Ansible, AWS, Azure, Azure Data Factory, C#, Databricks, GCP, Git, Grafana, Influxdb, MySQL, Postgres, Powershell, Python, SQL, SQL Server, Terraform
Reposted Yesterday
In-Office
San Francisco Bay Area, CA
160K-200K Annually
Senior level
Aerospace • Hardware • Logistics • Robotics • Software • Transportation
The Senior Site Reliability Engineer will lead cloud infrastructure initiatives, develop best practices, write software, and manage systems while working closely with developers. They will also participate in an on-call rotation and set high technical standards for interviews.
Top Skills: AWS, Kafka, Kubernetes
Reposted Yesterday
In-Office
San Francisco Bay Area, CA
160K-250K Annually
Mid level
Artificial Intelligence • Cloud • Software
The Senior Site Reliability Engineer will automate operations, improve workflows, manage secure infrastructure, and participate in on-call rotation for an AI-driven company.
Top Skills: Arista, AWS, Bash, Ceph, Chef, Cifs, Cisco, Dns, Docker, Elk Stack, Fortinet, Hp, HTTP, Icmp, Ip, Iscsi, Jenkins, Kubernetes, Linux/Debian, Mesosphere, Nfs, Node.js, Pivotal Greenplum, Postgres, Python, RabbitMQ, Raid, Ruby, S3, Scylla, Ssh, Ssl, Supermicro, Tcp, Tls, Ubuntu
Reposted 24 Days Ago
In-Office
San Francisco Bay Area, CA
110K-170K Annually
Mid level
Robotics • Pharmaceutical
The Hardware Reliability Engineer ensures the robustness of robotic systems through testing, analysis, and collaboration across teams to improve designs and reduce risks.
Top Skills: Onshape Cad, Python
Reposted 24 Days Ago
Easy Apply
In-Office
San Francisco Bay Area, CA
139K-178K Annually
Senior level
Artificial Intelligence • Information Technology • Machine Learning • Marketing Tech • Software • Biotech • Design
The Hardware Reliability Engineer plans and executes reliability testing, develops testing methods, performs failure analysis, and collaborates with cross-functional teams to ensure product quality.
Top Skills: Data Analysis, Electrical Engineering, Environmental Reliability, Mechanical Engineering, Reliability Testing
2 Days Ago
In-Office
San Francisco Bay Area, CA
255K-405K Annually
Mid level
Artificial Intelligence • Machine Learning • Generative AI
As a Software Engineer in Infrastructure Reliability, you'll design and build resilient systems, optimize performance, improve automation, and collaborate with teams to enhance infrastructure reliability.
Top Skills: AWS, Azure, Ci/Cd Pipelines, Datadog, Elk Stack, GCP, Grafana, Kubernetes, Prometheus, Splunk, Terraform
25 Days Ago
Hybrid
San Francisco Bay Area, CA
Senior level
Artificial Intelligence • Information Technology • Software
Lead end-to-end platform reliability: define SLIs/SLOs, harden production architecture, ensure Kubernetes runtime and queue safety, run incident command for Sev1/Sev2, own observability/on-call/runbooks, and gate risky releases while delivering a prioritized reliability roadmap.
Top Skills: Bullmq, Koa, Kubernetes, Node.js, Postgraphile, Postgres, React, Redis, Typescript
25 Days Ago
Easy Apply
In-Office
San Francisco Bay Area, CA
170K-250K Annually
Senior level
Artificial Intelligence • Software • Energy • Renewable Energy
Lead and own reliability engineering for solid state transformer products across the lifecycle: develop reliability guidelines, run DFMEAs and physics-of-failure tests, specify and analyze accelerated life tests, perform root-cause failure analysis, mentor a reliability team, collaborate with design/manufacturing/quality/suppliers, and monitor field performance to drive predictive health and reliability improvements.
Top Skills: 3D Finite Element Modeling, Alta, Bayesian Methods, Blocksim, Cross-Sectioning, Csam, Edx, Maximum Likelihood Estimation, Monte Carlo Analysis, Optical Microscopy, Reliasoft Synthesis Platform, Rga, Sem, Sherlock, Weibull Distribution, Weibull++, X-Ray, Xfmea
25 Days Ago
Hybrid
San Francisco Bay Area, CA
187K-225K Annually
Senior level
Artificial Intelligence • Machine Learning • Robotics • Software • Transportation • Design • Manufacturing
Lead reliability engineering for EV powertrain systems (EDU, high-voltage battery, power distribution). Define reliability targets, drive DFMEA, develop virtual and physical validation and PHM strategies, support field monitoring and corrective actions, embed durability in system design, and collaborate with suppliers and cross-functional teams to improve lifecycle reliability.
Top Skills: Sql, Pyspark, Python
Reposted 2 Days Ago
Hybrid
San Francisco Bay Area, CA
130K-185K Annually
Junior
Healthtech • Software
As a DevOps Engineer, you'll build and maintain scalable infrastructures, manage monitoring systems, provide operational support, and collaborate across teams to enhance the company's cloud environment.
Top Skills: Ansible, AWS, Azure, Bash, Chef, Docker, GCP, Github Actions, Jenkins, Postgres, Puppet, Python, Terraform
Reposted 2 Days Ago
Hybrid
San Francisco Bay Area, CA
180K-275K Annually
Senior level
Financial Services
Design, develop, and deploy robust platform solutions while ensuring reliability, scalability, and security of the system. Collaborate with teams to enhance tooling and automation.
Top Skills: GCP, Kubernetes, Terraform
22 Days Ago
Easy Apply
Remote or Hybrid
San Francisco Bay Area, CA
180K-220K Annually
Senior level
Healthtech • Information Technology • Software • Telehealth
The Senior Site Reliability Engineer will develop, monitor, and maintain distributed production systems, ensuring uptime for patients and providers while automating processes and supporting a large engineering team.
Top Skills: AWS, Docker, GCP, Kubernetes
17 Days Ago
Remote
San Francisco Bay Area, CA
130K-160K Annually
Mid level
Artificial Intelligence • Big Data • Cloud • Software • Analytics • Infrastructure as a Service (IaaS) • Big Data Analytics
As an Airflow Reliability Engineer, you'll provide expertise in Apache Airflow, solve challenges for customers, and contribute to open-source projects, while enhancing your technical and customer-facing skills.
Top Skills: Apache Airflow, AWS, Azure, Docker, GCP, Kubernetes, Postgres, Python, SQL
Reposted 3 Days Ago
In-Office or Remote
San Francisco Bay Area, CA
140K-205K Annually
Senior level
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills: AWS, Chef, Datadog, Go, Grafana, Java, Prometheus, Puppet, Python, Salt, Terraform
Reposted 3 Days Ago
In-Office or Remote
San Francisco Bay Area, CA
250K-295K Annually
Senior level
Artificial Intelligence • Software
Lead SRE pod and define reliability strategy. Scale ClickHouse and PostgreSQL for terabyte-level growth, optimize performance, build reliability patterns, automate operations, implement observability, and define SLOs and error budgets.
Top Skills: Alerting, Clickhouse, Distributed Tracing, Failover, Go, Metrics, Partitioning, Postgres, Python, Replication, Slos, Typescript
Reposted 4 Days Ago
In-Office or Remote
San Francisco Bay Area, CA
160K-179K Annually
Senior level
Fintech • Payments
The Senior Staff SRE leads reliability engineering initiatives, drives operational excellence, mentors staff, and influences architecture to enhance system reliability and performance.
Top Skills: Ai/Ml, AWS, Azure, Docker, Elk Stack, GCP, Grafana, Kubernetes, MySQL, NoSQL, Postgres, Splunk
Reposted 22 Days Ago
Easy Apply
Remote
San Francisco Bay Area, CA
219K-245K Annually
Expert/Leader
Big Data • Healthtech • HR Tech • Machine Learning • Software • Telehealth • Big Data Analytics
The Staff Site Reliability Engineer will architect, operate, and improve the platform while ensuring security compliance and enhancing development processes.
Top Skills: AWS, Elasticsearch, Istio, Kubernetes, Nats, Node.js, Postgres, Python, React, Terraform, Typescript
Reposted 22 Days Ago
Easy Apply
Remote
San Francisco Bay Area, CA
186K-219K Annually
Senior level
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The Senior Site Reliability Engineer will build and scale identity management tools, automate operations, ensure security, and support AWS, GCP, and Azure environments.
Top Skills: Ansible, AWS, Azure, C#, Cloud Identity Providers, Docker, GCP, Go, Infrastructure As Code, Java, Kubernetes, Python, Ruby, Terraform
23 Days Ago
Remote or Hybrid
San Francisco Bay Area, CA
170K-215K Annually
Senior level
HR Tech • Information Technology • Professional Services • Sales • Software
Own and operate production-grade Kubernetes infrastructure on AWS, build GitOps CI/CD with GitHub Actions and ArgoCD, develop AI agents and internal DevOps tooling, maintain Datadog-based observability, and manage on-call incident response while collaborating with engineering teams to improve reliability and delivery speed.
Top Skills: Ai/Llm, Argocd, AWS, Ci/Cd, Datadog, Github Actions, Gitops, Go, Kubernetes, Python
Reposted 4 Days Ago
Easy Apply
In-Office
San Francisco Bay Area, CA
130K-180K Annually
Senior level
Aerospace
As a Senior Reliability Test Engineer, you'll develop and implement reliability test strategies, work with engineering teams, and build a reliability test program to ensure product quality and longevity.
Top Skills: C, Python, SQL
Reposted 4 Days Ago
Hybrid
San Francisco Bay Area, CA
135K-285K Annually
Mid level
Software
As an AI Support Engineer, you'll manage support requests, resolve user issues, optimize ML models, and contribute to product development.
Top Skills: Tensorrt