Staff Site Reliability Engineer

Kevala

Reposted 5 Days Ago

In-Office or Remote

Hiring Remotely in San Francisco, CA, USA

136K-180K Annually

Senior level

The Staff Site Reliability Engineer will lead the design and maintenance of cloud infrastructure on GCP, drive IaC strategy, manage Kubernetes operations, ensure security compliance, and mentor engineers.

The summary above was generated by AI

As a Staff Site Reliability Engineer, you will be a key technical leader responsible for the architecture, reliability, and security of our entire cloud infrastructure. You will drive technical direction, mentor engineers, and solve our most complex infrastructure challenges as a hands-on contributor.

You will lead the management of our Google Cloud Platform (GCP) environment, drive our Infrastructure as Code (IaC) strategy, and ensure our Kubernetes-based microservices are deployed seamlessly and securely. You will serve as the expert for scalability, observability, and building the robust, automated systems that power Kevala's continuous deployment pipeline.

The applicant must have current, unrestricted work authorization in the United States. This job is not eligible for visa sponsorship.

What you will be doing

Architect & Maintain: Design, build, and maintain our core cloud-native infrastructure on Google Cloud Platform (GCP) following established best practices.
Infrastructure as Code (IaC): Lead our IaC strategy, writing and reviewing high-quality Terraform to manage all cloud resources in a repeatable and version-controlled way.
Kubernetes Operations: Manage and scale our Google Kubernetes Engine (GKE) clusters, including configuration of ingress and monitoring components.
Champion Security & Compliance: Integrate, implement, and audit security best practices across all infrastructure layers (GCP IAM, GKE policies, network security), ensuring regulatory compliance and leading incident response.
Database Reliability: Manage the provisioning, scaling, and reliability of our Postgres databases (e.g., Cloud SQL) and other data stores.
Observability: Build and refine our monitoring, tracing, logging, and alerting systems (e.g., OpenTelemetry, Grafana, Google Cloud's operations suite) to ensure high availability.
Mentorship and Design: Partner with engineering teams on scalable architecture design. Mentor other engineers on DevOps practices, cloud architecture, and security.

What you need to succeed

Experience: 8+ years in an SRE, DevOps, or Infrastructure Engineering role, with a proven track record of operating in a Staff or similar technical leadership capacity.
Leadership & Communication: Excellent communication skills with the ability to clearly articulate complex technical decisions, mentor team members, and drive projects to completion.
GCP Proficiency: Extensive hands-on experience designing and managing production environments in Google Cloud Platform.
Kubernetes (K8s) Expert: Advanced knowledge of Kubernetes and its ecosystem (GKE preferred), including cluster administration and deployment tooling (e.g., Helm).
Terraform/IaC: Extensive, production-level experience using Terraform to manage complex cloud environments.
Automation: Deep experience with automation tooling and scripting (e.g., Bash, Python, Go) to manage infrastructure and operations at scale.
Database Skills: Experience managing and scaling relational databases like Postgres in a production environment.
Security Implementation & Auditing: Practical experience designing, implementing, and auditing security controls for cloud infrastructure, networks, and applications (e.g., IAM, network security).

The compensation for this opportunity includes a base salary range of $136,000 - $180,000, plus equity (stock options). This is our target compensation range and is subject to multiple factors, including level, experience, and location. As you go through our interview process, our recruiter will work with you to identify a competitive base salary within the proposed range and combine it with an equity package that reflects your excitement about joining Kevala.

This is a fully remote role which can be located anywhere within the United States. Please note that actual salaries may vary based on factors including, but not limited to, education, experience, and location.

55 Francisco St, San Francisco, California, United States, 94133-2112
