ML Engineer, TTS Systems
Location: San Francisco, CA or Remote (US)
About Bland
At Bland.com, we empower enterprises to build and scale AI phone agents. As a fast-growing team in San Francisco, our mission is to advance customer interactions with businesses through natural, reliable, and highly human-like voice technologies. We are backed by $65M in funding from leading Silicon Valley investors, including Emergence Capital, Scale Venture Partners, and Y Combinator, as well as the founders of Twilio, Affirm, and ElevenLabs.
The Role: ML Engineer, TTS Systems
As an ML Engineer focused on text-to-speech (TTS), you will own the deployment, optimization, and maintenance of our production TTS systems. Your work will transform advanced research models into highly performant, scalable, and robust real-world solutions serving millions of real-time voice interactions daily. You will collaborate with research and engineering teams to implement inference-optimized TTS models, streamline deployment processes, and monitor live systems to ensure best-in-class performance for enterprise clients.
What You Will Do
Deploy large-scale TTS models to production environments and optimize them for reliable, low-latency inference.
Implement and refine post-training techniques (such as DPO, GRPO, and RLHF) and modern inference techniques to maximize throughput and audio quality.
Collaborate with cross-functional teams to ensure seamless rollout, A/B testing, and iterative improvement of production models.
Maintain high availability and scalable infrastructure for multi-speaker, expressive, and controllable TTS use cases.
Design and document best practices for efficient TTS inference and system reliability.
What Makes You a Great Fit
Hands-on experience deploying large-scale neural TTS models in cloud or on-prem production settings.
Deep expertise in TTS inference optimization (e.g., quantization, kernel optimization, batching strategies) and familiarity with post-training methods such as GRPO.
Strong understanding of real-time, low-latency audio processing pipelines and their challenges.
Working knowledge of distributed systems, GPU acceleration, and scalable production infrastructure.
Ability to diagnose and resolve quality, performance, and reliability issues in deployed voice systems.
Comfortable working in fast-paced startup environments and taking full ownership from deployment through system maintenance.
Bonus Points
Contributions to open-source TTS systems or production audio frameworks.
Prior work in telephony, streaming, or live enterprise communication environments.
Benefits and Compensation
Healthcare, dental, vision
Meaningful equity in a fast-growing company
Every tool you need to succeed
Beautiful office in Jackson Square, SF with rooftop views
Competitive salary: $160,000 to $250,000
If you’re passionate about scaling production TTS systems, driving inference excellence, and delivering seamless, human-like voice at scale, we want to hear from you.
