Top Reliability Engineer Jobs in San Francisco, CA

Coupa

Lead Database Reliability Engineer - 11606

3 Days Ago

In-Office or Remote

San Francisco Bay Area, CA

142K-199K Annually

Senior level

Artificial Intelligence • Fintech • Information Technology • Logistics • Payments • Business Intelligence • Generative AI

Lead design, automation, and maintenance of cloud-based database infrastructure (primarily SQL Server and MySQL). Improve reliability with monitoring, HA/DR, automation, troubleshooting, on-call support, and mentoring of junior engineers while collaborating across teams.

Top Skills: Aurora, AWS, Bash, Failover Clustering, MySQL, New Relic, Orchestrator, PMM, Python, RDS, Ruby, SQL Server, VividCortex

Grow Therapy

Senior Platform Reliability Engineer

5 Days Ago

Hybrid

San Francisco Bay Area, CA

182K-250K Annually

Senior level

Healthtech • Social Impact • Software

Define and scale reliability practices across the company by creating SLO/SLA frameworks, improving observability, evolving incident response, building self-service tooling and scorecards, and driving cross-team adoption to enable teams to build and operate reliable production systems at scale.

Top Skills: AWS, Datadog, EKS, Kubernetes, Postgres, Terraform

Samsara

Senior Hardware Reliability Engineer

Reposted 13 Days Ago

Easy Apply

Hybrid

San Francisco Bay Area, CA

204K-240K Annually

Senior level

Artificial Intelligence • Cloud • Computer Vision • Hardware • Internet of Things • Software

The Senior Hardware Reliability Engineer ensures product reliability through planning, testing, and collaboration across engineering and operations. Responsibilities include leading investigations, analyzing failure data, and designing reliability strategies throughout the product lifecycle.

Top Skills: Environmental Testing, Failure Analysis, Firmware Engineering, Hardware Reliability, Reliability Modeling, Stress Testing

Drata

Senior Site Reliability Engineer

Reposted 3 Days Ago

Hybrid

San Francisco Bay Area, CA

167K-226K Annually

Senior level

Security • Software • Cybersecurity • Automation

As a Senior Site Reliability Engineer, you will enhance the reliability of Drata’s product teams through automation, architecture reviews, and operational excellence using cloud-native technologies.

Top Skills: AIOps, AWS, Bash, Datadog, Docker, Git, GitHub Actions, Kubernetes, Linux, MySQL, Python, Terraform

Airwallex

Senior Site Reliability Engineer, Spend

Reposted 4 Days Ago

Hybrid

San Francisco Bay Area, CA

160K-250K Annually

Senior level

Artificial Intelligence • Fintech • Payments • Business Intelligence • Financial Services • Generative AI

Lead design and delivery of scalable cloud infrastructure for the Spend product. Embed with development teams to drive reliability, performance, observability, incident response, and automation. Own SLOs, runbooks, DevOps metrics, and collaborate with central DevOps and security teams to ensure compliance and resilience. Lead infrastructure projects including new service launches, data centre migrations, and modernising data pipelines.

Top Skills: Analytics Pipelines, AWS, Data Streaming, DevOps, GCP, Incident Response, Kubernetes, Observability, SLOs, SRE

Ōura

Hardware Reliability Engineer

Reposted 2 Days Ago

In-Office

San Francisco Bay Area, CA

139K-178K Annually

Senior level

Artificial Intelligence • Information Technology • Machine Learning • Marketing Tech • Software • Biotech • Design

The Hardware Reliability Engineer plans and executes reliability testing, develops testing methods, performs failure analysis, and collaborates with cross-functional teams to ensure product quality.

Top Skills: Data Analysis, Electrical Engineering, Environmental Reliability, Mechanical Engineering, Reliability Testing

Hebbia

Software Engineer, Site Reliability

Reposted 8 Days Ago

In-Office

San Francisco Bay Area, CA

160K-300K Annually

Senior level

Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Financial Services • Generative AI

As a Site Reliability Engineer, you'll design and improve critical production systems, lead incident response, and enhance observability while embedding with product teams to ensure reliability and performance at scale.

Top Skills: AWS, C++, CI/CD, Go, Python, Rust

Superhuman

Site Reliability Engineer

Reposted 8 Days Ago

Hybrid

San Francisco Bay Area, CA

214K-260K Annually

Senior level

Artificial Intelligence • Information Technology • Machine Learning • Natural Language Processing • Productivity • Software • Generative AI

The SRE will ensure the reliability of backend systems, scale Kubernetes-based control planes, and improve automation mechanisms while managing incident processes.

Top Skills: AWS, Azure, Docker, GCP, Java, Kubernetes, Linux, Terraform

CoreWeave

Senior Production Engineer (Reliability)

9 Days Ago

In-Office

San Francisco Bay Area, CA

182K-242K Annually

Senior level

Cloud • Information Technology • Machine Learning

Own, build, and operate production reliability tooling and systems across the cloud stack. Lead projects to improve availability, scalability, automation, observability, and incident response. Ship production services in Python/Go, participate on-call, reduce toil through automation, and maintain long-lived platform frameworks.

Top Skills: Cloud-Native, Go, GPU-Accelerated Infrastructure, Kubernetes, Metrics, Python, SLOs/SLIs, Structured Logs, Tracing

Order.co

Senior Site Reliability Engineer

Reposted 21 Hours Ago

Remote or Hybrid

San Francisco Bay Area, CA

175K-200K Annually

Senior level

eCommerce • Fintech • Payments • Software

The role involves ensuring software reliability and performance, managing incidents, developing infrastructure automation, and mentoring junior engineers within a platform team.

Top Skills: AWS, CloudFormation, Datadog, Kubernetes, OpenTelemetry, Ruby, Ruby on Rails, Terraform

Attain

Sr/Staff Site Reliability Engineer, Consumer Apps

Reposted 10 Days Ago

Easy Apply

In-Office

San Francisco Bay Area, CA

Mid level

AdTech

As a Site Reliability Engineer, you'll maintain the infrastructure for systems, ensure efficiency, automate processes, monitor databases, and participate in architecture discussions.

Top Skills: Amazon Kinesis, AWS Lambda, AWS SNS, BigQuery, Docker, GCP (Google Cloud Platform), GitLab, Google Cloud Functions, Google Cloud Run, Google Pub/Sub, Grafana, Istio, Kafka, Kubernetes, MySQL, Prometheus, Spanner, SQL, Terraform

MongoDB

Site Reliability Engineer (Senior or Staff), Infrastructure Security

Reposted 10 Days Ago

Easy Apply

Remote or Hybrid

San Francisco Bay Area, CA

127K-249K Annually

Senior level

Big Data • Cloud • Software • Database

The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.

Top Skills: Ansible, AWS, Azure, Cloud Security Tools, CloudFormation, GCP, Go, Terraform

OpenArt

Senior Platform & Reliability Engineer

5 Days Ago

Hybrid

San Francisco Bay Area, CA

Senior level

Artificial Intelligence • Machine Learning • Software • Generative AI

Help design, scale, and improve platform reliability: define SLOs/SLIs, run on-call and incident response, build observability, improve resilience to external dependencies, enhance CI/CD and deploy safety, optimize cost and capacity, and influence infrastructure architecture.

Top Skills: Amplitude, AWS, Cloud Run, Containers, ECS, Fargate, Firebase, GCP, Kubernetes, Modal, Next.js, Node.js, Python, React, Redis, Sentry, Serverless, TypeScript, Upstash

Skydio

Senior Hardware Test and Reliability Engineer

Reposted 5 Days Ago

Hybrid

San Francisco Bay Area, CA

170K-226K Annually

Senior level

Artificial Intelligence • Hardware • Robotics • Software

The role involves developing and executing test strategies for autonomous systems, collaborating with engineering teams for reliability, and analyzing data for risk assessments.

Top Skills: Python, SQL

Cisco ThousandEyes

Senior Site Reliability Engineer (FedRAMP) - ThousandEyes

Reposted 11 Days Ago

Hybrid

San Francisco Bay Area, CA

147K-278K Annually

Senior level

Cloud • Software

Responsible for maintaining FedRAMP-compliant infrastructure, collaborating with software engineers, and ensuring system availability and security. Duties include infrastructure design, automation, monitoring, and incident response.

Top Skills: AWS, Go, Kubernetes, Puppet, Python, Terraform

Deepgram

Site Reliability Engineer - AI & ML Infrastructure (Kubernetes, AWS & Terraform)

Reposted 2 Days Ago

Remote

San Francisco Bay Area, CA

150K-220K Annually

Senior level

Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI

The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.

Top Skills: AWS, Bash, Go, Kubernetes, Python, Slurm, Terraform

Skip Innovations

Hardware Reliability Engineer

Reposted 6 Days Ago

Hybrid

San Francisco Bay Area, CA

160K-210K Annually

Senior level

Wearables

The Hardware Reliability Engineer will oversee reliability testing for Skip's wearable devices, manage FMEA, perform environmental stress testing, and lead root cause analyses to ensure product performance in real-world conditions.

Top Skills: Environmental Testing, FMEA, HALT, HASS, Thermal Chambers, Vibration Tables, Weibull Analysis

Zocdoc

Senior Site Reliability Engineer

13 Days Ago

Easy Apply

Hybrid

San Francisco Bay Area, CA

210K-270K Annually

Senior level

Healthtech • Information Technology • Software • Telehealth

Lead reliability efforts for Zocdoc's cloud-based, consumer-facing services: monitor and maintain production systems, automate tooling and infrastructure, support scaling and performance, debug production incidents, and work with product teams to improve uptime and reliability.

Top Skills: AWS, Distributed Systems, DNS, Docker, GCP, GenAI, HTTP, HTTPS, Kubernetes, Load Balancer, Microservices, NTP, Reverse Proxy, TCP/IP, TLS, Web Application Firewall

Block

Senior Site Reliability Engineer

Reposted 13 Days Ago

In-Office or Remote

San Francisco Bay Area, CA

161K-284K Annually

Senior level

Blockchain • eCommerce • Fintech • Payments • Software • Financial Services • Cryptocurrency

The Senior Site Reliability Engineer will enhance reliability of Block's platform, improve incident response using AI tools, and coordinate incident management. Responsibilities include building reliable systems, standardizing tools, and leading high-severity incidents during on-call rotations.

Top Skills: Amazon Web Services, Datadog, DynamoDB, gRPC, HTTP, Istio, Java, JSON, Kotlin, Kubernetes, LaunchDarkly, MySQL, Protocol Buffers, Terraform, Vitess

Runpod

Site Reliability Engineer

Reposted 4 Days Ago

Easy Apply

Remote

San Francisco Bay Area, CA

150K-200K Annually

Senior level

Artificial Intelligence • Cloud • Software • Infrastructure as a Service (IaaS)

As a Site Reliability Engineer, you will ensure system stability and resilience, define reliability standards, and automate operational processes while collaborating cross-functionally to improve performance and reduce incidents.

Top Skills: Bash, CI/CD, Docker, Go, Grafana, Kubernetes, Linux, Prometheus, Python

Dropbox

Staff Site Reliability Engineer, Production Engineering

Reposted 4 Days Ago

Remote

San Francisco Bay Area, CA

223K-302K Annually

Expert/Leader

Artificial Intelligence • Cloud • Consumer Web • Productivity • Software • App development • Data Privacy

The role involves defining reliability strategies, leading initiatives across teams, enhancing monitoring and incident response, and mentoring engineers at Dropbox.

Top Skills: AI Technologies, Debugging, Distributed Systems, Incident Response, Observability, Reliability Risk Management, SLAs, SLOs

Domino Data Lab

Staff Site Reliability Engineer

5 Days Ago

Easy Apply

Remote or Hybrid

San Francisco Bay Area, CA

200K-230K Annually

Senior level

Artificial Intelligence • Machine Learning

Lead development of AI-assisted reliability tooling, own incident response end-to-end, improve observability and SLO/SLI frameworks, scale single-tenant SaaS operations, mentor engineers, and reduce recurring operational toil through engineering and automation.

Top Skills: Cloud Platforms, Go, Kubernetes, Linux, LLM/AI Tooling, Logs and Tracing, Observability Tooling, Python, SLO/SLI Frameworks

Amperesand

Reliability Engineer - Power Electronics

Reposted 9 Days Ago

In-Office

San Francisco Bay Area, CA

115K-160K Annually

Entry level

Artificial Intelligence • Software • Energy • Renewable Energy

The Reliability Engineer will drive reliability for Solid-State Transformers, conduct test plans, analyze data, document risks, and support hardware deployment.

Top Skills: Ansys Sherlock, MATLAB, Python, R, ReliaSoft

Embedding VC

Founding Platform & Reliability Engineer

Reposted 9 Days Ago

In-Office or Remote

San Francisco Bay Area, CA

Senior level

Artificial Intelligence • Software • Generative AI

The Founding Platform & Reliability Engineer will design and operate reliable, scalable infrastructure for an AI storytelling platform, involving hands-on implementation and strategic decision-making.

Top Skills: Amplitude, AWS, Cloud Run, Firebase, GCP, Modal, Next.js, Node.js, Python, React, Redis, Sentry, TypeScript, Upstash

Redwood Materials

Reliability Engineer, Energy Storage

Reposted 10 Days Ago

In-Office

San Francisco Bay Area, CA

123K-193K Annually

Senior level

Energy

The Reliability Engineer will define reliability requirements, conduct tests, analyze designs, and collaborate on hardware reliability within energy storage products.

Top Skills: JMP, Minitab, Python