Top Reliability Engineer Jobs in San Francisco, CA
Artificial Intelligence • Cloud • Computer Vision • Hardware • Internet of Things • Software
The Senior Hardware Reliability Engineer ensures product reliability through planning, testing, and collaboration across engineering and operations. Responsibilities include leading investigations, analyzing failure data, and designing reliability strategies throughout the product lifecycle.
Top Skills:
Environmental Testing • Failure Analysis • Firmware Engineering • Hardware Reliability • Reliability Modeling • Stress Testing
Artificial Intelligence • Machine Learning
Own and modernize Domino's Tempest scale-testing platform; build repeatable automated validation, sizing guidance, and cloud-scale test automation; partner with platform teams to enable multi-cloud scale testing and improve test reliability and reporting.
Top Skills:
Ci Systems • Cloud Platforms • Cloud-Native Tooling • End-To-End Frameworks • Kubernetes • Multi-Cloud • Performance/Load Testing Frameworks • Python • Tempest
Artificial Intelligence • Healthtech • Logistics • Social Impact • Software • Telehealth
The Senior Site Reliability Engineer will enhance the reliability and security of infrastructure for in-home healthcare services, using cloud technology and automation to improve systems and processes.
Top Skills:
AWS • Bash • GCP • Python • Terraform • Typescript
Cloud
The role involves designing, optimizing, and maintaining PostgreSQL and MySQL databases, ensuring high availability, reliability, and performance for mission-critical systems, while automating operational tasks and responding to incidents.
Top Skills:
Ansible • AWS • Datadog • GCP • Go • Grafana • Kubernetes • MySQL • Postgres • Prometheus • Python • Terraform
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer, you'll maintain and enhance the Currents data export system, focusing on observability, scalability, and reliability, while mentoring junior engineers and solving performance issues.
Top Skills:
Buildkite • Datadog • Docker Swarm • Git • Gitlab • Java • Jenkins • Kafka • Kotlin • Kubernetes • MongoDB • Pagerduty • Postgres • Ruby • Sentry • Sidekiq • Sns • Sqs
Reposted 6 Days Ago
Cloud • Software
Responsible for maintaining FedRAMP compliant services, designing infrastructure, monitoring systems, and ensuring security for federal regions, while driving automation and collaboration with development teams.
Top Skills:
AWS • Fedramp • Go • Kubernetes • Puppet • Python • Terraform • Unix/Linux
Security • Software • Cybersecurity • Automation
As a Senior Site Reliability Engineer, you will enhance the reliability of Drata’s product teams through automation, architecture reviews, and operational excellence using cloud-native technologies.
Top Skills:
Aiops • AWS • Bash • Datadog • Docker • Git • Github Actions • Kubernetes • Linux • MySQL • Python • Terraform
Artificial Intelligence • Information Technology • Machine Learning • Natural Language Processing • Productivity • Software • Generative AI
As a Site Reliability Engineer, you'll build software to ensure system reliability, scale infrastructure, and deploy ML systems while collaborating with cross-functional teams.
Top Skills:
AWS • Azure • Docker • GCP • Java • Kubernetes • Linux • Terraform
Energy • Renewable Energy
The Staff Reliability Engineer will ensure hardware reliability in high-voltage electronics, develop reliability test programs, and collaborate on design and testing across teams.
Top Skills:
Hv Electronics • Power Conversion • Python
eCommerce • Retail • Software
As a Senior Database Reliability Engineer, you will manage database systems, enhance observability through automation, and lead database upgrade initiatives while ensuring security and reliability.
Top Skills:
AWS • Ci/Cd • DynamoDB • Elasticsearch • MongoDB • MySQL • Postgres • Powershell • Python • Redis • SQL Server
Fintech • Financial Services
The Staff Infrastructure Reliability Engineer leads Redfin's production database and storage systems, collaborating on strategies for reliability, scalability, and performance, while mentoring engineers and guiding complex technical discussions.
Top Skills:
AWS • Aws Aurora • Aws Rds • Aws S3 • DynamoDB • Elasticache • Opensearch • Postgres • Python • Rdbms
Reposted 9 Days Ago
AdTech
As a Site Reliability Engineer, you'll maintain the infrastructure for systems, ensure efficiency, automate processes, monitor databases, and participate in architecture discussions.
Top Skills:
Amazon Kinesis • Aws Lambda • Aws Sns • BigQuery • Docker • Gcp (Google Cloud Platform) • Gitlab • Google Cloud Functions • Google Cloud Run • Google Pub/Sub • Grafana • Istio • Kafka • Kubernetes • MySQL • Prometheus • Spanner • SQL • Terraform
Reposted 12 Hours Ago
Artificial Intelligence • Machine Learning • Natural Language Processing • Software • Conversational AI
The engineer will build and operate AI/ML infrastructure, managing services on AWS and bare metal, using tools like Kubernetes and Terraform.
Top Skills:
AWS • Bash • Go • Kubernetes • Python • Slurm • Terraform
Blockchain • eCommerce • Fintech • Payments • Software • Financial Services • Cryptocurrency
The Senior Site Reliability Engineer will enhance reliability of Block's platform, improve incident response using AI tools, and coordinate incident management. Responsibilities include building reliable systems, standardizing tools, and leading high-severity incidents during on-call rotations.
Top Skills:
Amazon Web Services • Datadog • DynamoDB • Grpc • HTTP • Istio • Java • JSON • Kotlin • Kubernetes • Launchdarkly • MySQL • Protocol Buffers • Terraform • Vitess
Marketing Tech • Mobile • Software
As a Senior Site Reliability Engineer at Braze, you'll ensure uptime for internal services, improve automation, and develop infrastructure tools, collaborating across teams to enhance reliability and scalability.
Top Skills:
Chef • Docker • Kafka • Kubernetes • MongoDB • Redis • Ruby On Rails • Terraform
Artificial Intelligence • Fintech • Hardware • Information Technology • Sales • Software • Transportation
Design, scale, and manage AWS services for IoT devices. Collaborate on infrastructure, optimize performance, and ensure high availability of services.
Top Skills:
AWS • Bash • Go • Helm • Kubernetes • Python • Ruby • Terraform
Reposted 2 Days Ago
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will develop and support distributed storage services, ensuring reliability and operational safety, with a focus on automation and efficiency.
Top Skills:
AWS • Azure • Dns • Go • Google Cloud Platform • Kubernetes • Linux • Python • Tcp/Ip • Tls
Big Data • Cloud • Software • Database
Seeking a Site Reliability Engineer with expertise in networking and distributed systems for building secure multi-cloud infrastructure. Responsibilities include maintaining network architecture and ensuring reliable service-to-service communication, involving a 24/7 on-call rotation.
Top Skills:
AWS • Azure • Bgp • Dns • GCP • Ipv6 • Kubernetes • Load Balancing • Mtls • Service Mesh • Tcp/Ip • Tls • Vpcs • Vpns
Artificial Intelligence • Other • Security • Software • Analytics • Big Data Analytics
The Lead Site Reliability Engineer will oversee the Infrastructure SRE team, focusing on system reliability, automation, and mentoring while collaborating with product engineering.
Top Skills:
Ci/Cd • Datadog • Docker • Elk Stack • Gitops • Go • Kubernetes • Linux/Unix • New Relic • NoSQL • Prometheus • Python • SQL • Stackdriver • Terraform
Artificial Intelligence • Information Technology • Machine Learning • Marketing Tech • Software • Biotech • Design
The Hardware Reliability Engineer plans and executes reliability testing, develops testing methods, performs failure analysis, and collaborates with cross-functional teams to ensure product quality.
Top Skills:
Data Analysis • Electrical Engineering • Environmental Reliability • Mechanical Engineering • Reliability Testing
Automotive • Big Data • Information Technology • Robotics • Software • Transportation • Manufacturing
The Staff Software Engineer will develop reliability software features for autonomous vehicles, focusing on multi-sensor systems and frameworks while collaborating with cross-functional teams and driving improvements in reliability and performance.
Top Skills:
C++11 • Embedded Linux • Embedded Systems • Posix • Python
Reposted 15 Days Ago
Big Data • Cloud • Software • Database
The Senior Site Reliability Engineer will lead security design and implementation for cloud infrastructures, mentor teams, and automate security solutions.
Top Skills:
Ansible • AWS • Azure • Cloud Security Tools • CloudFormation • GCP • Go • Terraform
Artificial Intelligence • Hardware • Robotics • Software
The role involves developing and executing test strategies for autonomous systems, collaborating with engineering teams for reliability, and analyzing data for risk assessments.
Top Skills:
Python • SQL
eCommerce • Healthtech • Kids + Family • Retail • Social Media
Seeking a Senior Software Engineer, Site Reliability to ensure system stability, scalability, and reliability, while optimizing AWS infrastructure using modern DevOps practices and tools like Terraform, Docker, and Kubernetes.
Top Skills:
AWS • CircleCI • Cronitor • Datadog • Docker • Github Actions • Jenkins • Kubernetes • MySQL • Pagerduty • React • Redis • Ruby On Rails • Sentry • Sidekiq • Terraform
Artificial Intelligence • Fintech • Machine Learning • Social Impact • Software
As a Principal Software Engineer on the SRE team, lead best practices adoption, mentor engineers, and improve system reliability and user experience through automation and collaboration.
Top Skills:
Cdk • CloudFormation • Datadog • Go • JavaScript • Prometheus • Python • Terraform • Typescript