Braintrust

Customer Reliability Manager

Posted 8 Days Ago
In-Office
San Francisco, CA, USA
Senior level
Lead and grow a Customer Reliability Engineering team, ensuring high-quality support and reducing friction in deployments and operations across various models.
The summary above was generated by AI
About the company

Braintrust is the AI observability platform. By connecting evals and observability in one workflow, Braintrust gives builders the visibility to understand how AI behaves in production and the tools to improve it.

Teams at Notion, Stripe, Zapier, Vercel, and Ramp use Braintrust to compare models, test prompts, and catch regressions — turning production data into better AI with every release.

About the Role

At Braintrust, exceptional support is one of our most important strategic advantages. Support is part of Engineering at Braintrust and exists to help reduce friction in the deployment and operation of our product. Our customers are developers building LLM-powered applications, and they move fast. We win by helping them move faster.

We’re looking for a manager to build and lead a team of highly senior, knowledgeable Customer Reliability Engineers who provide ambitiously high-quality support focused on customer infrastructure. This team is responsible for reducing friction associated with Braintrust's various deployment models (hybrid, BYOC, and SaaS Enterprise). Engineers on this team directly scope and attempt fixes for infrastructure issues, manage high-stakes customer environments, and ensure product reliability across all customer deployment types.

This role blends engineering leadership, deployment expertise, and customer experience. If you love upleveling Senior+ talent, scaling complex, cutting-edge support motions, and reducing pain for developers, we’d love to talk with you.

What You’ll Do
  • Lead and grow a team of Customer Reliability Engineers, delivering reliable, high-touch support across all Braintrust deployment models: hybrid, Bring Your Own Cloud (BYOC), and enterprise SaaS.

  • Own the primary after-hours on-call rotation for customer-reported SEV1s, with backup coverage from Customer Solution Architects (CSAs) and Developer Support Engineers.

  • Run incident response and escalation, including enabling customer infrastructure teams while jumping in hands-on for the highest-severity issues.

  • Own day-to-day tickets tied to deployments, upgrades, and performance troubleshooting.

  • Triage and scope deployment-related feature requests and bug reports, attempt fixes when feasible, and route custom work to Professional Services when needed.

  • Lead new BYOC deployments and upgrades.

  • Respond to high-severity alerts for BYOC customers.

  • Validate each new data plane release against the standard hybrid deployment, and partner with Docs to ship upgrade guidance alongside the changelog.

  • Coach and mentor the team on infrastructure debugging, deployment best practices, and strong customer ownership.

  • Synthesize customer feedback and operational trends for Product and Engineering to improve reliability and reduce recurring pain points.

You Might Be a Fit If You
  • Have 5–10+ years of experience leading support for developer-facing products.

  • Are deeply familiar with deploying Terraform-, Helm-, and Kubernetes-based infrastructure across major cloud providers.

  • Are comfortable reviewing, debugging, and reasoning about backend services, infrastructure, and deployment configurations.

  • Take ownership of customer-impacting issues end-to-end, ensuring accountability, follow-through, and continuous improvement.

  • Communicate clearly and empathetically, especially when navigating ambiguity or high-stakes customer situations.

  • Are deeply curious about LLM use cases and excited to lead teams building cutting-edge support systems for AI products that are measurable, reliable, and trustworthy.

Bonus Points For
  • Familiarity with OpenAI, Anthropic, or similar LLM providers at a systems or integration level.

  • Experience guiding teams working with datasets, evaluation metrics, or prompt engineering.

  • A track record of building or scaling support tooling, documentation programs, or product-led growth initiatives.

  • Experience as a senior technical leader or tech lead in a high-growth startup environment.

  • A history of partnering hands-on with Engineering on production fixes for backend services, SDKs, or infrastructure.

  • Experience leading support for products with self-hosted offerings (e.g., Terraform, Kubernetes) and comfort leading incident response involving customer-owned containerized environments.

Benefits include
  • Medical, dental, and vision insurance

  • Daily lunch, snacks, and beverages

  • Flexible time off

  • Competitive salary and equity

  • AI Stipend

Equal opportunity

Braintrust is an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.

HQ

Braintrust San Francisco, California, USA Office

1 Main St, San Francisco, CA, United States, 94105
