Get the job you really want.

Top Reliability Engineer Jobs in San Francisco, CA

7 Days Ago
Easy Apply
Remote
San Francisco Bay Area, CA
164K-226K Annually
Senior level
Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software
As a Senior Software Engineer focused on Site Reliability Tooling, you'll enhance system reliability, implement SRE practices, and build automation tools to support site reliability across Upstart's infrastructure.
Top Skills: CDK, CloudFormation, Datadog, Go, JavaScript, Kubernetes, Prometheus, Python, Terraform, TypeScript
11 Days Ago
Easy Apply
In-Office
San Francisco Bay Area, CA
160K-185K Annually
Senior level
Industrial • Manufacturing
Lead reliability strategy for HVAC components, develop predictive life models, design tests, and improve product durability. Analyze data and mentor junior engineers.
Top Skills: Accelerated Test Methods, HVAC Systems, IoT Devices, Predictive Life Models, Weibull Modeling
11 Days Ago
Easy Apply
In-Office
San Francisco Bay Area, CA
160K-185K Annually
Senior level
Appliances
Lead reliability strategies for HVAC components, develop predictive models, conduct tests, analyze data, and mentor junior engineers.
Top Skills: Accelerated Life Testing, Corrosion Testing, Environmental Chambers, HVAC Systems, Reliability Engineering, Vibration Tables, Weibull Analysis
Reposted 7 Days Ago
Remote
San Francisco Bay Area, CA
140K-210K Annually
Senior level
Sales • Software • Automation
Join the Infrastructure Team to build and maintain critical systems, automating database lifecycles and enhancing disaster recovery with a focus on resilience and simplicity.
Top Skills: Ansible, Argo CD, AWS, ClickHouse, Docker, Elasticsearch, Flask, GitHub Actions, Grafana, Kubernetes, MongoDB, Postgres, Python, Redis, Terraform
Reposted 2 Days Ago
Easy Apply
Remote
San Francisco Bay Area, CA
130K-150K Annually
Mid level
Marketing Tech
The Cloud Reliability Engineer develops, configures, and deploys cloud tools, enhances applications, ensures observability, and participates in on-call rotations.
Top Skills: AWS, CI/CD, Docker, GitHub Actions, Go, Google BigQuery, GCP, Kubernetes, Linux, Python, SQL, Terraform
Reposted 17 Days Ago
Hybrid
San Francisco Bay Area, CA
180K-230K Annually
Senior level
Artificial Intelligence • Healthtech • Software
As a Site Reliability Engineer, you will manage cloud infrastructure, implement observability, and ensure system reliability by collaborating with engineering teams and maintaining databases.
Top Skills: Azure, Bash, Git, Kubernetes, Postgres, Python, Redis, SQL, TypeScript, VS Code
Reposted 8 Days Ago
Easy Apply
Remote or Hybrid
San Francisco Bay Area, CA
127K-249K Annually
Senior level
Big Data • Cloud • Software • Database
This role involves building and maintaining observability services, ensuring service reliability, and collaborating with other teams on best practices.
Top Skills: AWS, Fluent Bit, GCP, Jaeger, Kubernetes, Azure, Quickwit, Splunk, Vector, VictoriaMetrics
13 Days Ago
In-Office
San Francisco Bay Area, CA
135K-155K Annually
Expert/Leader
Information Technology • Security • Cybersecurity
Responsible for managing Oracle RAC databases, optimizing performance, ensuring security and integrity, and providing 24x7 support for production applications.
Top Skills: Cassandra, Ceph, Elasticsearch, Kafka, Oracle, Redis
14 Days Ago
In-Office
San Francisco Bay Area, CA
152K-228K Annually
Senior level
Cloud
The role involves designing and optimizing PostgreSQL clusters, automating database tasks, and ensuring high availability and performance while collaborating with other engineering teams.
Top Skills: Ansible, Datadog, Go, Grafana, Kubernetes, MySQL, Postgres, Prometheus, Python, Terraform
Reposted 6 Days Ago
Remote
San Francisco Bay Area, CA
Senior level
Artificial Intelligence • Cybersecurity
The Database Reliability Engineer will ensure database availability, performance, scalability, and security across AWS, collaborating with application and security teams.
Top Skills: AWS, Crossplane, Datadog, GitLab CI/CD, Kubernetes, NoSQL, OpenSearch, Postgres, Terraform
Reposted 21 Days Ago
Easy Apply
Hybrid
San Francisco Bay Area, CA
170K-220K Annually
Senior level
Artificial Intelligence • Machine Learning • Software
As a Staff Site Reliability Engineer, you will enhance the reliability, scalability, and performance of production services by applying SRE principles, implementing observability practices, automating processes, and collaborating with engineering teams.
Top Skills: AWS, Azure, CloudFormation, Datadog, Docker, ELK Stack, GCP, Go, Grafana, Jaeger, Kubernetes, OpenTelemetry, OpenTofu, Prometheus, Python, Terraform
Reposted 12 Days Ago
Remote or Hybrid
San Francisco Bay Area, CA
175K Annually
Senior level
eCommerce • Legal Tech • Professional Services • Software • Data Privacy
The Site Reliability Engineer will ensure systems run smoothly, work with automation tools, resolve issues, and drive operational improvements.
Top Skills: AWS, Azure, CloudFormation, Docker, GCP, Grafana, Kubernetes, Memcached, New Relic, OpenTelemetry, Postgres, Prometheus, Pulumi, Redis, Sentry, Terraform
Reposted 13 Days Ago
Remote or Hybrid
San Francisco Bay Area, CA
140K-215K Annually
Senior level
Cloud • Computer Vision • Information Technology • Sales • Security • Cybersecurity
The Senior Engineer will automate and ensure the reliability of large-scale distributed systems, troubleshoot server issues, and manage operational aspects for high availability and performance.
Top Skills: Ansible, C++, Chef, Docker, ELK, FreeNAS, Go, Grafana, iSCSI, Java, Kubernetes, Linux, NAS, NFS, Object Storage, Prometheus, Puppet, Python, SAN, VMware, Windows
Reposted 18 Days Ago
In-Office
San Francisco Bay Area, CA
150K-215K Annually
Senior level
Aerospace • Hardware • Logistics • Robotics • Software • Transportation
The Design for Reliability Engineer is responsible for ensuring the safety and reliability of drone-delivery systems through testing, statistical analysis, and innovative design solutions.
Top Skills: JMP, MATLAB, Minitab, Python, ReliaSoft
15 Days Ago
Remote
San Francisco Bay Area, CA
148K-195K Annually
Mid level
Blockchain • Fintech • Payments • Financial Services • Cryptocurrency • Web3
The Site Reliability Engineer will build and maintain infrastructure, improve software systems, develop scalable microservices, and ensure quality software delivery.
Top Skills: AWS, Go, Google Cloud Platform, Java, Kubernetes, Azure, SQL
Reposted 24 Days Ago
Hybrid
San Francisco Bay Area, CA
176K-241K Annually
Senior level
Fintech • Machine Learning • Payments • Software • Financial Services
Lead a team of developers to create cloud-based solutions while driving transformations using DevOps practices. Collaborate across teams to solve business challenges and mentor engineers.
Top Skills: Ansible, AWS, Docker, Go, Java, Kubernetes, Python, Ruby, SQL, Terraform
Reposted 18 Days Ago
Easy Apply
Remote or Hybrid
San Francisco Bay Area, CA
118K-231K Annually
Senior level
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will support, maintain and grow the Atlas platform, focusing on automating processes and running multi-cloud environments.
Top Skills: AWS, Azure, DNS, GCP, Go, HTTP, Linux, Python, Ruby, TLS
Reposted 4 Hours Ago
In-Office or Remote
San Francisco Bay Area, CA
120K-160K Annually
Mid level
Consumer Web • Mobile
As a Site Reliability Engineer at Patreon, you'll improve AWS infrastructure, implement SRE practices, enhance Kubernetes capabilities, and develop automation tools.
Top Skills: Ansible, AWS, Chef, Kubernetes, Puppet, Python, Terraform
Reposted 4 Hours Ago
In-Office
San Francisco Bay Area, CA
165K-250K Annually
Senior level
Artificial Intelligence • Healthtech • Information Technology • Software
As a Site Reliability Engineer, you will manage the production environment, focusing on infrastructure design, automation, and optimizing deployment pipelines to ensure high availability.
Top Skills: Helm, Kafka, Kubernetes, Postgres, Python, Redis, Terraform, TypeScript
Reposted 4 Hours Ago
In-Office or Remote
San Francisco Bay Area, CA
205K-235K Annually
Senior level
Financial Services
The Senior Cluster Site Reliability Engineer will enhance the research compute cluster's uptime, reliability, and performance through engineering and operational improvements, ensuring high availability for researchers working on machine learning problems.
Top Skills: Ansible, AWS, Ceph, Docker, ELK, GCP, Grafana, Horovod, HPC, InfiniBand, Kubeflow, Kueue, Loki, Lustre, MLflow, OpenTelemetry, Podman, Prometheus, Python, RDMA, Ruby, S3, Singularity, Slurm, Terraform
Reposted 4 Hours Ago
In-Office or Remote
San Francisco Bay Area, CA
150K-250K Annually
Senior level
Artificial Intelligence • Software
The Network Operations Engineer will lead site operations, ensuring network reliability, handling incidents, coordinating hardware repairs, and supporting datacenter deployments. Responsibilities include executing maintenance runbooks and mentoring junior engineers while collaborating with cross-functional teams.
Top Skills: Ansible, BGP, Clos Topologies, EVPN/VXLAN, High-Radix Switching, Python
Reposted 4 Hours Ago
Easy Apply
In-Office
San Francisco Bay Area, CA
180K-440K Annually
Mid level
Information Technology
As a Site Reliability Engineer, you'll design and operate scalable storage systems and optimize performance for AI research data management.
Top Skills: Go, Kubernetes, Pulumi, Rust
Yesterday
In-Office
San Francisco Bay Area, CA
196K-248K Annually
Senior level
Automotive
As a Senior Technical Program Manager for SRE & On-call Excellence, you will manage projects that improve incident response, on-call protocols, and system reliability, collaborating with various engineering teams to drive successful execution.
Top Skills: Cloud Infrastructure, DevOps Practices, Distributed Systems, Site Reliability Engineering
Yesterday
Easy Apply
In-Office
San Francisco Bay Area, CA
169K-276K Annually
Senior level
Energy
The Site Reliability Engineer will design and implement scalable systems, automate IT infrastructure management, and support deployed systems, ensuring high availability and performance.
Top Skills: Active Directory, Ansible, AWS, Azure, Chef, JSON, Linux, Puppet, Python, REST, VMware, Windows Server, YAML
Yesterday
In-Office
San Francisco Bay Area, CA
130K-200K Annually
Senior level
Edtech
The Senior Site Reliability Engineer will ensure product reliability and performance, develop monitoring and alerting systems, and propose architectural changes.
Top Skills: AWS, Bash, C, C++, Docker, GCP, Java, Kubernetes, Perl, Python