Get the job you really want.
Maximum of 25 job preferences reached.
Top Senior Site Reliability Engineer Jobs in San Francisco, CA
Cloud • Software
Responsible for maintaining FedRAMP-compliant infrastructure, collaborating with software engineers, and ensuring system availability and security. Duties include infrastructure design, automation, monitoring, and incident response.
Top Skills:
AWSGoKubernetesPuppetPythonTerraform
Software
Design, implement, and maintain scalable backend systems and APIs; build cloud infrastructure (preferably GCP) using Terraform; operate containerized workloads with Kubernetes; ensure reliability, security, and performance; participate in on-call rotations, architecture discussions, and cross-functional delivery.
Top Skills:
Ci/CdCloud AutomationContainer OrchestrationGoGoogle Cloud PlatformIamInfrastructure As CodeKubernetesMicroservicesPythonService-Oriented ArchitectureTerraform
Artificial Intelligence • Consumer Web • Digital Media • Information Technology • Social Impact • Software
The Senior Site Reliability Engineer will manage system incidents, improve monitoring and logging, optimize database infrastructure, and collaborate on scaling systems efficiently.
Top Skills:
AWSClickhouseKubernetesMySQLPostgresRedis
Artificial Intelligence • Information Technology • Software
The Site Reliability Engineer will ensure high availability and performance of CodeRabbit's AI-powered code review platform, enhancing system reliability through infrastructure ownership, performance engineering, and automation.
Top Skills:
AWSDatadogDockerElk StackGoogle Cloud PlatformGrafanaKubernetesLinuxNode.jsPrometheusTerraformTypescript
Information Technology • Automation
The SRE/Infrastructure Engineer will architect and manage secure, scalable systems for automated penetration testing, optimizing reliability, and enhancing infrastructure based on customer demand. Responsibilities include maintaining production environments, leading technical discussions, and promoting high coding standards.
Top Skills:
AWSAzureCloudFormationElkGCPNew RelicOpentelemetryPostgresPrometheusTerraform
Reposted 3 Days AgoSaved
Easy Apply
Easy Apply
Artificial Intelligence • Blockchain • Fintech • Financial Services • Cryptocurrency • NFT • Web3
The role involves supporting network infrastructure, automating cloud infrastructure, managing CI/CD workflows, and ensuring operational excellence in IT support, including incident response and security practices.
Top Skills:
AnsibleAWSBashDockerGitKubernetesPythonRubyTerraform
Artificial Intelligence • Software • Generative AI
The Lead Site Reliability Engineer will drive technical strategy, ensure high service availability, manage cloud infrastructure, and lead a team to optimize systems and automate processes.
Top Skills:
AWSAzureDockerGoogle Cloud PlatformKubernetesTerraform
Software
As a Site Reliability Engineer, you'll build and maintain infrastructure for ML models, automate processes, and collaborate cross-functionally.
Top Skills:
Circle CiCloudFormationElk StackGithub ActionsGitlab CiGrafanaJenkinsKubernetesOpentelemetryPrometheusPulumiTerraform
Artificial Intelligence • Software
The SRE at Fluidstack is responsible for ensuring infrastructure reliability and performance, handling complex production issues, and improving platform stability.
Top Skills:
AnsibleBashGoKubernetesPythonSlurmTerraform
Big Data • Cloud • Software • Database
Develop and maintain Kubernetes runtime environments, support developers, resolve critical issues, and participate in on-call rotations for production systems.
Top Skills:
AWSAzureCert-ManagerCorednsCrdsCriCsiGatekeeperGCPGoHelmKubernetesKustomizeOperatorsPythonTerraform
Information Technology • Mobile • Software
As a Site Reliability Engineer, you'll ensure system reliability and scalability, automate processes, optimize performance, and collaborate on system design.
Top Skills:
AWSAzureBashCloudFormationDatadogDockerElkGoGoogle Cloud PlatformGrafanaHelmKubernetesNew RelicPrometheusPulumiPythonTerraform
Blockchain • Web3
As a Site Reliability Engineer, you'll enhance observability, logging, and tracing, collaborating with engineers to optimize performance and security of infrastructure.
Top Skills:
AnsibleAWSAws CdkGCPGitGoGrafanaKubernetesLgtmLokiMimirOpentelemetryPrometheusRustSentryTempoTerraformTypescriptWebassembly
New
Track Smarter, Apply Better.
Ditch the spreadsheets. Organize your job search with our freeApplication Tracker.
Use For Free
Software • Cryptocurrency
Manage and scale Kubernetes clusters, automate infrastructure, optimize performance, maintain blockchain nodes, and improve system reliability while collaborating with product teams.
Top Skills:
Aws (Ec2Aws EksDatadogDockerIam)KubernetesOpentelemetryPulumiRdsS3Terraform
Cloud • Information Technology
As a Staff Site Reliability Engineer, you will enhance cloud product lines, ensuring real-time scalability, collaborating with teams, and automating builds.
Top Skills:
AnsibleAWSAzureBashDnsDockerEnvoyGCPGitGoGrafanaHaproxyHTTPJenkinsKafkaKubernetesLinuxMySQLOciOpentelemetryPostgresPrometheusPuppetPythonRedisTcp/IpTelegrafTerraformTls
Healthtech • Information Technology • Telehealth
Lead Site Reliability Engineer responsible for ensuring cloud services reliability, automation, and performance while mentoring a team and collaborating cross-functionally. Drive initiatives to enhance incident management and enforce security compliance.
Top Skills:
AnsibleAWSAws CloudformationAzureBashDatadogDockerElk StackGoGCPGrafanaKubernetesPrometheusPuppetPythonTerraform
Information Technology • Consulting
The Site Reliability Engineer will drive the observability roadmap, standardize monitoring practices, optimize alerting tools, and collaborate with teams to enhance operational efficiency and system reliability.
Top Skills:
.NetAsp.Net CoreAWSAzureC#DatadogDockerGCPGrafanaKubernetesNew RelicPowershellPrometheusReactSplunkWeb Apis
Cloud
The Site Reliability Engineer will manage Kubernetes platforms, optimize AWS cloud infrastructure, ensure high availability, and automate deployment while handling troubleshooting and security compliance.
Top Skills:
AWSBashCi/CdCloudwatchElk StackGoGrafanaHelmIstioKubernetesPrometheusPythonTerraform
Artificial Intelligence • Software
As a Senior Staff SRE Tech Lead, you'll oversee reliability and scalability, mentor engineers, optimize systems, and enhance data infrastructure.
Top Skills:
ClickhouseGoPostgresPythonTypescript
Cloud • Information Technology
As a Staff Platform Engineer, you'll develop and maintain infrastructure components using Go and Node.js, improve service reliability, mentor juniors, and manage data ecosystems.
Top Skills:
EnvoyExpressGoJenkinsKafkaMySQLNode.jsPostgresPuppetPythonReactRedis
Sports
Manage and improve the AWS infrastructure, deploy into new regions, monitor releases, and implement new technologies in a fast-paced environment.
Top Skills:
AWSDockerGrafanaKubernetesPrometheusPython
Cloud • Security • Software
As a Site Reliability Engineer, you'll design and optimize cloud infrastructure, automate compliance, manage Kubernetes, and maintain reliability in regulated environments.
Top Skills:
Ci/Cd PipelinesDockerGoogle Cloud Platform (Gcp)KubernetesNist 800-53
Software • Analytics
The role involves automating and managing AWS infrastructure, ensuring reliability and scalability of stateful systems, and optimizing deployment processes. You'll also handle incident responses and improve operational tooling.
Top Skills:
AWSKubernetesTerraformTerragrunt
Cloud • Security • Software • Cybersecurity
As a Site Reliability Engineer, you will troubleshoot production issues, automate systems, define database requirements, and collaborate with Dev and QA teams for stability.
Top Skills:
AnsibleCassandraChefNoSQLPythonRedis
Artificial Intelligence • Software
As Staff SRE Tech Lead, you'll oversee platform reliability and scalability, lead the SRE team, architect data infrastructures, and optimize systems while implementing automation and observability practices.
Top Skills:
ClickhouseGoPostgresPythonTypescript
AdTech • Marketing Tech • Analytics
As a Staff Software Engineer - SRE, you'll manage cloud infrastructure, improve application reliability, collaborate across teams, and support back-office systems.
Top Skills:
AWSDatadogDockerKafkaKibanaKubernetesLinuxPostgresPythonRdsRedshiftShell/BashSparkTerraform
Top San Francisco Companies Hiring Senior Site Reliability Engineers
See AllPopular Job Searches
All Filters
Total selected ()
No Results
No Results




.png)
.png)







.png)


















