Top Senior Site Reliability Engineer Jobs in San Francisco, CA
Edtech
As a Staff Site Reliability Engineer, you will lead reliability engineering by designing automation, scaling systems, driving architectural improvements, and mentoring engineers.
Top Skills:
Argo CD, CI/CD, CircleCI, Datadog, GCP, GitHub Actions, Go, Istio, Kubernetes, Python, Terraform
Edtech
Lead the technical vision for reliability at Quizlet by architecting self-healing systems, mentoring engineers, and improving infrastructure resilience.
Top Skills:
CI/CD, Datadog, Go, Istio, Jeli, Kubernetes (GKE), Python, Terraform
Financial Services
As a Site Reliability Engineer, you will enhance and monitor production systems, automate workflows, and respond to incidents to maintain system reliability.
Top Skills:
Airflow, Bazel, Git, Go, Grafana, gRPC, Jenkins, Kubernetes, Linux, Pandas, Postgres, Prometheus, Python, R, Relational Databases, SQL
Cloud
The Staff SRE will ensure platform performance and reliability at scale, drive automation, and collaborate cross-functionally to safeguard service quality for customers.
Top Skills:
AWS, Azure, BigQuery, GCP, Go, Grafana, Kubernetes, Prometheus, Python, Shell, SQL, Terraform
Artificial Intelligence • Healthtech • Other • Productivity • Telehealth • Conversational AI • Generative AI
As a Founding Site Reliability Engineer, you'll enhance system reliability, automate tasks, manage incidents, and mentor others. You'll shape infrastructure and ensure optimal performance and stability for healthcare services.
Top Skills:
AWS, Azure, Datadog, GCP, Grafana, Honeycomb, Kubernetes, OpenTelemetry, PagerDuty, Prometheus, Sentry, Terraform
Information Technology • Software • Big Data Analytics
The Site Reliability Engineer will design, analyze, and troubleshoot large-scale distributed systems, focusing on operating systems and performance tuning.
Top Skills:
Apache, Java
Cloud • Security • Software • Cybersecurity • Automation
As a Senior Site Reliability Engineer at GitLab, you will automate and manage the lifecycle of GitLab environments, ensuring reliability and scalability while leading incident responses and architectural decisions.
Top Skills:
Ansible, AWS, ELK, GCP, Go, Grafana, Kubernetes, Prometheus, Ruby, Terraform
Information Technology • Energy
As a Site Reliability Engineer, you will design high-availability systems, maintain security, troubleshoot production issues, and mentor the development team while ensuring best practices in infrastructure management.
Top Skills:
Cloud, Infrastructure-as-Code, Programming, Scripting
Information Technology • Other • Software • Consulting
The Site Reliability Engineer (SRE) will ensure system reliability and performance, automate operations, develop CI/CD pipelines, and manage cloud infrastructure.
Top Skills:
Ansible, AWS, Azure, Datadog, Docker, ECS, Java, Kubernetes, Python, Terraform, Terragrunt
Edtech • Kids + Family • Sales • Social Impact • Software
The role involves designing, building, and maintaining cloud infrastructure on AWS, focusing on scalability, security, and developer efficiency while providing technical leadership to the team.
Top Skills:
AWS, Buildkite, CI/CD, CloudFormation, CloudWatch, Datadog, Docker, GitHub Actions, Jenkins, Kubernetes, Postgres, Prometheus, Terraform
Information Technology • Cryptocurrency
The Site Reliability Engineer will lead technical initiatives, architect solutions, troubleshoot issues, mentor team members, and improve observability practices.
Top Skills:
Argo CD, Bash, ELK Stack, GCP, Go, Grafana, Helm, Kubernetes, Prometheus, Python, Terraform
Gaming • Mobile • Software
As an SRE Manager, you will lead a team to enhance infrastructure services, manage incidents, and contribute to technical decisions while ensuring high availability and scalability of systems.
Top Skills:
Amazon AWS, Ansible, Artifactory, Crossplane, Datadog, Elasticsearch, GitLab, Go, GCP, Jaeger, Jenkins, Kubernetes, Azure, MongoDB, Packer, Postgres, Python, Redis, Terraform, Vault
Edtech
As an SRE Engineer Lead at Speak, you'll enhance system reliability, manage incidents, improve observability, and collaborate with cross-functional teams to scale infrastructure and ensure a seamless language learning experience.
Top Skills:
GCP, Kubernetes, Node.js, Postgres, Prometheus, Python, Redis, Sentry, Terraform
Artificial Intelligence • Machine Learning • Natural Language Processing • Software
The Public Sector Site Reliability Engineer will manage cloud infrastructure, ensure compliance with regulations, implement observability tools, lead incident response, and enhance automation in federal environments.
Top Skills:
Ansible, AWS GovCloud, Azure Government, Bash, Datadog, Docker, Elastic, Go, Grafana, Kubernetes, Prometheus, Pulumi, Python, Terraform
Consumer Web • eCommerce • Fashion • Retail
Seeking a Staff Software Engineer for the SRE team to enhance CI/CD systems, optimize infrastructure, and improve developer productivity. Responsibilities include architecting solutions, mentoring engineers, and driving technical initiatives to elevate operational excellence.
Top Skills:
Ansible, Bash, GitHub Actions, Go, Helm, Jenkins, Kubernetes, Python, Ruby, Spinnaker, Terraform
Blockchain • Software
As a Site Reliability Engineer at Offchain Labs, you will manage infrastructure in cloud environments, design CI/CD workflows, and enhance system reliability with a focus on blockchain technology.
Top Skills:
Argo CD, AWS, Azure, CodeBuild, GCP, GitHub Actions, Go, Grafana, Kubernetes, Loki, Prometheus, Python, Terraform
Fintech
As a Site Reliability Engineer, you will enhance system reliability through scalable infrastructure, observability practices, automation, and collaboration with engineering teams.
Top Skills:
AWS, Datadog, Go, Grafana, Java, Kubernetes, Node.js, Prometheus, Pulumi, Python, Terraform
News + Entertainment
As an Ads Reliability Engineer, you'll enhance the reliability of the Netflix Ad Suite by designing scalable systems, automating processes, collaborating with teams for observability, and responding to incidents while promoting a culture of reliability.
Top Skills:
AWS, Azure, GCP, Go, Java, Kubernetes, Python, Terraform
Software
As a Site Reliability Engineer, you'll build and maintain large-scale systems, automate operational tasks, and enhance platform reliability while collaborating with teams on SLOs and SLIs.
Top Skills:
Ansible, AWS, Azure, Azure Resource Manager, Chef, CloudFormation, Datadog, ELK, GCP, Go, Google Deployment Manager, Grafana, JavaScript, Kubernetes, Linux, MongoDB, MySQL, New Relic, Node.js, Nomad, Oracle, Postgres, Puppet, Python, Salt, Splunk, Terraform
Software
As an AI Support Engineer, you'll manage support requests, resolve user issues, optimize ML models, and contribute to product development.
Top Skills:
TensorRT
Artificial Intelligence • Legal Tech • Professional Services • Software
As a Staff Software Engineer in Site Reliability, you'll manage infrastructure for reliability and scalability, lead incident management, and automate operational tasks.
Top Skills:
AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, incident.io, PagerDuty, Pulumi, Python, Sentry, Terraform
Artificial Intelligence • Legal Tech • Professional Services • Software
As a Software Engineer in Site Reliability, you will ensure the reliability and performance of our AI platform through automation and strategic infrastructure management.
Top Skills:
AWS, Azure, Bash, CloudFormation, Datadog, GCP, Go, Kubernetes, PagerDuty, Python, Sentry, Terraform
Energy
The Site Reliability Engineer will design and implement systems, drive automation, coordinate between teams, support deployed systems, and ensure scalability for rapid growth.
Top Skills:
Active Directory, Ansible, AWS, Azure, Chef, JSON, Linux, Puppet, Python, REST, VMware, Windows Server, YAML
Information Technology • Legal Tech
The Senior Technology Site Reliability Engineer is responsible for maintaining and optimizing infrastructure and applications, ensuring reliability and performance while automating processes and collaborating with teams.
Top Skills:
AWS, Chef, Datadog, Go, Grafana, Java, Prometheus, Puppet, Python, Salt, Terraform
Artificial Intelligence • Computer Vision • Hardware • Robotics • Metaverse
The Principal Staff SRE will lead initiatives in building and optimizing core infrastructure services on-prem and cloud, deploying and managing services at scale, and improving performance with automation and monitoring tools.
Top Skills:
DHCP, DNS, eBPF, Go, LDAP, Linux, NTP, Python, Terraform, XDP