Together AI

Senior Platform Engineer, Voice AI

Reposted 8 Days Ago
In-Office
San Francisco, CA, USA
200K-260K Annually
Senior level
As a Senior Platform Engineer at Together AI, you will build and manage the API and infrastructure layer for a Voice AI platform, focusing on real-time streaming capabilities and developer experience, ensuring reliability for production voice applications.
The summary above was generated by AI
About the Role

Together AI is building the best inference infrastructure for voice applications. Our Voice AI platform powers production-grade, real-time voice agents and applications — serving speech-to-text and text-to-speech models with best-in-class latency and reliability.

We're looking for a Senior Platform Engineer to own the API and infrastructure layer for voice workloads. You'll build the real-time WebSocket and HTTP APIs that developers use to ship voice experiences, design autoscaling for latency-sensitive streaming workloads, and ensure our multi-provider voice platform is reliable enough for production voice agents handling millions of calls.

This is a foundational hire on a small, high-impact team. Voice APIs have fundamentally different infrastructure requirements than text-based inference — bidirectional audio streaming, stateful connections, tight latency SLOs, and complex multi-model routing. You'll define how developers interact with Together's voice platform as we grow from early customers to the default infrastructure for voice AI.

  • Own the real-time API layer (WebSocket + HTTP streaming) that powers Together's voice platform.
  • Design autoscaling and orchestration for voice workloads running on tens of thousands of GPUs.
  • Build the developer experience — APIs, observability, and tooling — for a fast-growing product area.
  • Work with production voice customers (contact centers, AI agents, communication platforms) to ship what they actually need.
  • Join a small, early-stage team with outsized impact on a new product line.
Responsibilities
  • Build and harden real-time WebSocket and HTTP streaming APIs for STT and TTS — including connection lifecycle management, backpressure, error handling, and reconnection, at the reliability bar needed for production voice agents.
  • Design and ship autoscaling for voice model endpoints that handles bursty, real-time traffic patterns — accounting for concurrent connection limits, streaming state, and hard latency ceilings.
  • Implement voice-specific API features: word-level alignment, real-time speaker diarization, audio format flexibility (G.711 mu-law for telephony, PCM, WebRTC formats), pronunciation controls, and multi-context WebSocket support.
  • Build voice-specific observability — latency breakdowns, audio quality signals, and dashboards that help both the team and customers debug issues.
  • Own multi-model normalization across our model partners (Cartesia, Deepgram, Rime, and others), ensuring consistent API behavior regardless of the underlying provider.
  • Collaborate with the ML engineering side of the team on the interface between the API layer and the model serving stack, ensuring latency and reliability requirements are met end-to-end.
  • Contribute to developer experience: API design, documentation, integration cookbooks, a playground, and examples that show how best-in-class voice agents are built.
  • Lay the groundwork for multiple new products down the line.
Requirements
  • 5+ years of experience building large-scale, real-time distributed systems and API services.
  • Deep expertise in real-time streaming infrastructure — WebSocket server architecture, Server-Sent Events, bidirectional streaming, connection multiplexing, and stateful protocol design.
  • Expert-level programming in TypeScript and Python; experience with Rust is a plus.
  • Strong distributed systems fundamentals: load balancing, autoscaling, rate limiting, and traffic shaping for latency-sensitive workloads.
  • Experience with Kubernetes — including custom autoscalers, resource management, and health checking for stateful services.
  • Strong product sense — you care about API ergonomics and think about what developers building voice apps actually need.
  • Comfort working on a small, early-stage team where you'll wear multiple hats and move fast.
  • Experience with audio or media protocols (WebRTC, g711, PCM encoding) is a strong plus.
  • Familiarity with ML model serving infrastructure and how inference engines work is a plus — you'll interface with the serving layer regularly.
  • Full-stack experience (React, Next.js) is a nice-to-have for contributing to developer-facing tooling.
  • Bachelor's or Master's degree in Computer Science, Computer Engineering, or related field, or equivalent practical experience.
About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers and engineers on our journey to build the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other competitive benefits. The US base salary range for this full-time position is $200,000 - $260,000, plus equity and benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our privacy policy at https://www.together.ai/privacy  

Together AI San Francisco, California, USA Office

584 Castro St, #2050, San Francisco, California, United States, 94114
