LiteLLM

DevOps Engineer

Posted 2 Days Ago
In-Office
San Francisco, CA, USA
150K-200K Annually
Mid level

LiteLLM is the most popular AI Gateway in the world, used by the largest companies (Adobe, Netflix, NASA, etc.) to give their developers access to LLMs and adjacent services (MCPs, Vector Stores, etc.).

Why do companies use LiteLLM Enterprise?

Companies adopt LiteLLM Enterprise once they put LiteLLM into production and need enterprise features such as Prometheus metrics (production monitoring), or need to give a large number of people LLM access through SSO (single sign-on) or JWT (JSON Web Token) authentication.

What you will be working on

We are hiring an exceptional engineer to own release infrastructure and release security at LiteLLM. This is an opportunity to join us in person as an early employee and make a large impact at a high-growth startup. You will own a critical part of the company with a high degree of autonomy: making sure we can ship secure, reliable releases on a consistent cadence.

We work 6 days per week in our SF office, approximately 60 hours per week in total.

We are looking for a software engineer with a strong background in infrastructure, CI/CD, and release engineering. You should be comfortable working across Helm, Terraform, release automation, testing systems, and the developer infrastructure needed to guarantee stable releases. This is a hands-on role.

You should be able to investigate test failures, distinguish real regressions from flaky tests, write Python, fix minor test issues, remove dead tests, and improve the overall reliability of the release pipeline. You should also be able to architect a secure end-to-end release process: how code moves from commit to published artifact, how access is controlled, how secrets are handled, and how we reduce the chance of bad or unauthorized releases.

What you will do
  • Own secure, regular releases for LiteLLM, including 2 nightly releases and 1 stable release per week.

  • Manage and improve the infrastructure behind our release process, including Helm, Terraform, CI/CD, and other developer systems needed to keep releases stable.

  • Investigate test failures and determine whether they are true regressions, flaky tests, or dead tests that should be fixed or removed.

  • Write Python to fix minor test issues, improve release reliability, and support developer workflows.

  • Architect and implement a secure release process across build, test, approval, and publish steps.

  • Work closely with the engineering team to improve release quality, reduce operational risk, and keep shipping velocity high.

What we're looking for
  • 2+ years of experience in infrastructure engineering, DevSecOps, release engineering, or related systems work.

  • Proficient in Python and comfortable making code changes in test and release systems.

  • Experience with Terraform, Helm, CI/CD systems, and cloud infrastructure.

  • Strong judgment around release reliability, testing, and debugging.

  • Ability to distinguish between real regressions and flaky infrastructure or test behavior.

  • Ability to design secure release processes, including access controls, secrets handling, and safe publishing workflows.

  • Ability to collaborate effectively with engineers across product, infra, and security.

About LiteLLM

LiteLLM (https://github.com/BerriAI/litellm) is a Python SDK and Proxy Server (LLM Gateway) for calling 100+ LLM APIs in the OpenAI format, including Bedrock, Azure, OpenAI, VertexAI, and Cohere. It is used by companies like Rocket Money, Adobe, Twilio, and Siemens.
