LeanData

Senior Site Reliability Engineer

Reposted 17 Days Ago
In-Office
Santa Clara, CA, USA
140K-180K Annually
Senior level
Lead the modernization of AWS cloud infrastructure, implement automation, ensure system reliability, and manage performance with a focus on security and incident response.
The summary above was generated by AI

LeanData helps the world’s fastest-growing companies automate, simplify, and accelerate revenue.

We are looking for a Senior Site Reliability Engineer to lead the strategic evolution of our cloud infrastructure. Reporting directly to the SVP of Engineering, this role is designed for a builder: someone who wants to move beyond maintenance and into the realm of architectural transformation.

You will have the autonomy to evaluate our existing AWS footprint and lead the charge in modernizing our environment. Your mission is to take a high-velocity system and implement the best practices, guardrails, and automated architectures that will support our next 10x of scale. You will be the primary authority on reliability, performance, and infrastructure security.

Please note: This is a hybrid role based in our Santa Clara, CA office, with an in-office schedule of two days per week – Monday and Wednesday.

Key Responsibilities
  • Architectural Modernization: Lead the design and implementation of a scalable, "Cloud-First" AWS architecture. You will drive the transition toward fully automated, state-of-the-art Infrastructure as Code (Terraform).

  • High Availability & Resilience: Design and implement robust Disaster Recovery (DR) and Business Continuity plans, moving our services toward a zero-downtime deployment model.

  • Performance & Capacity Engineering: Own the strategy for capacity planning and autoscaling. You will optimize our compute resources (EC2, Lambda) to handle bursty traffic patterns with precision and cost-efficiency.

  • Advanced Observability: Define our monitoring and alerting philosophy using New Relic for deep APM and system insights. Pair this with IncidentIO to ensure we catch and resolve issues before they impact customers.

  • Streamlined CI/CD: Partner with feature teams to refine Change Management and CI/CD pipelines, ensuring code moves from "commit" to "production" safely and predictably.

  • Cloud Security: Harden our network architecture and application security posture, including WAF management and secure service-to-service communication.

The Tech Stack
  • Cloud Infrastructure: AWS (EC2, Lambda, SQS, SNS, ALB, API Gateway, S3, WAF).

  • Observability & Incident Response: New Relic (APM/Infrastructure), IncidentIO.

  • Automation & Tools: Terraform, Redis/ElastiCache, Shell Scripting, NPM/PM2.

  • Application Ecosystem: NodeJS, Python, C#, Angular, Apex.

  • Integration: Salesforce Managed Packages, Microsoft Dynamics 365.

Who You Are
  • Experienced Architect: 5+ years of experience in SRE, DevOps, or Systems Engineering, with a proven track record of managing complex AWS environments.

  • Proven Incident Commander: You demonstrate calm, decisive leadership during high-pressure outages. You have extensive experience running blameless postmortems and, crucially, driving the remediation work needed to prevent recurrence.

  • Observability Pro: You have deep experience configuring New Relic (or similar platforms) to create meaningful dashboards, SLIs, and SLOs.

  • Automation Advocate: You believe that manual intervention is a bug. You have deep experience with Terraform and a "Code-First" approach to infrastructure.

  • Strategic Problem Solver: You can look at a complex, "needs-based" architecture and formulate a clear, prioritized roadmap to move it toward industry best practices.

  • Collaborative Leader: You enjoy working with feature engineers to help them build "reliability-by-design" into their services.

  • Education: A Bachelor’s degree in Computer Science, Engineering, or a related technical field (or equivalent professional experience).

Why work at LeanData:

  • LeanData covers employee insurance premiums up to 90%

  • Stock options in LeanData for all full-time employees

  • Flexible PTO

  • 401(k) plan

HQ

LeanData Santa Clara, California, USA Office

2901 Patrick Henry Drive, Santa Clara, CA, United States, 95054


