At HUD, we’re building the future of how companies and individuals train and evaluate AI. We believe that in the near future, most post-training data used to align and improve LLMs will flow through HUD.
We build a platform and developer tools that let teams create post-training data through RL environments and run reinforcement fine-tuning (RFT) reliably, reproducibly, and at scale.
We’re trusted by foundation labs, Fortune 500s, and fast-growing startups.
We’re also a high-caliber team: former founders, published ML researchers, Olympiad medalists, and engineers who have built products with real adoption. We run lean, move fast, and hold an extremely high bar.
The Role
We’re hiring a Founding Engineer to be a core contributor across our platform, SDK, and developer experience. This is a high-ownership role at the center of the product: you’ll build primitives, APIs, and workflows that ML engineers and researchers use daily to create data, run post-training, and evaluate systems.
You’ll operate across the stack—from Python SDK design to backend systems and infrastructure—while staying relentlessly product-minded. You’ll also work directly with customers (engineers and researchers at startups, enterprises, and frontier labs) to understand their workflows, unblock them, and ship what they actually need.
If you’ve ever cared deeply about how ML tools feel to use—ergonomics, reliability, abstractions, and “it just works” workflows—this role is for you.
Who You Are
Product owner with elite engineering taste
You don’t just build features; you build products. You can hold a cohesive vision for developer experience, make sharp tradeoffs, and ship workflows that feel obvious in hindsight.
Full stack, Python-first, systems-capable
You’re deeply fluent in Python and comfortable across the stack: backend APIs, storage, distributed systems, and front-end surfaces when needed. You understand containers and runtime environments at a real level; Docker isn’t magic to you.
ML dev tools instinct + high technical aptitude
You have unusually strong technical intuition. You spot leaky abstractions, design APIs that age well, and build tooling that makes power users faster. You care about correctness, performance, and sharp interfaces.
Comfortable with ML fundamentals (RL a big plus)
You don’t need to be an academic, but you should be able to reason about core ML concepts and math. Ideally you’ve trained models, worked with post-training workflows, or understand reinforcement learning well enough to build tools around it.
Evals mindset
You care about evaluation as a first-class product surface. Experience with Inspect or other evaluation harnesses/frameworks is a major plus, as is building eval and training pipelines for LLMs, agents, or multimodal systems.
AI-native builder
You’re proficient with modern AI coding tools and agentic workflows (Cursor, Copilot, Claude/ChatGPT-style assistants, eval-driven iteration, etc.). You move fast without sacrificing rigor.
Customer-minded engineer
You can jump into a customer’s environment, understand what’s broken, propose a solution, and deliver it, while also feeding insights back into the roadmap. You can communicate clearly with both engineers and researchers.
What You’ll Do
Build the core HUD platform
Design and implement backend systems for post-training workflows (RFT jobs, dataset/data flow primitives, run tracking, artifacts, permissions).
Build reliable execution and orchestration primitives with strong isolation and reproducibility.
Build and iterate on our Python SDK: clean APIs, excellent docs, great errors, sharp defaults, and extensibility.
Create “golden path” workflows for common user goals: creating post-training data, launching RFT runs, evaluating results, and iterating quickly.
Help build eval pipelines for LLMs/agents that connect naturally to post-training loops (capability measurement → data creation → training → re-eval).
Build with Docker, Linux, and cloud infrastructure in mind; ensure consistent environments across local, CI, and production.
Improve performance, observability, and debuggability of job execution and data pipelines.
Contribute to Kubernetes deployment patterns and scaling.
Partner with engineers and researchers at startups, enterprises, and foundation labs.
Turn messy real-world feedback into product improvements: better abstractions, missing primitives, clearer docs, smoother onboarding.
Required
Strong production experience in Python.
Comfort across the stack (APIs, data systems, frontend integration where needed).
Deep understanding of Docker and Linux environments; strong debugging ability.
Cloud competence (K8s and AWS fundamentals—compute, networking, storage, IAM).
Strong product instincts and a bias toward shipping.
Ability to write clean, maintainable code with strong taste in interfaces and DX.
Strong Plus
Experience with reinforcement learning, post-training, or model training workflows.
Experience building or using LLM/agent eval frameworks (Inspect, EleutherAI tooling, custom harnesses).
Experience designing SDKs, CLIs, or developer platforms.
Kubernetes experience (deployment, scaling, job orchestration).
Active participation in the ML community (open-source contributions, writing, research engagement, etc.).
You own product features end to end, working autonomously with the team to improve core customer experiences.
The SDK feels intuitive and inevitable: sharp, consistent, well-documented, hard to misuse.
The platform is dependable under real-world load: reproducible runs, fast performance, clear logs, great debugging, smooth onboarding.
Customers feel like HUD gives them leverage—your work directly drives adoption and retention.
High talent density, low ego. You’ll work with unusually strong peers who care about craft.
Real ownership. You’ll shape the core product and its direction at a pivotal stage.
Hard problems with real impact. Post-training, evals, and RL workflows are still early—your work defines best practices.
Move fast, build right. We care about speed and quality, and we invest in doing things correctly.
Locations: San Francisco / Singapore
Type: Full-time
Visa/Relocation: Available for strong candidates (US/Singapore)
Compensation: $150,000-$240,000 salary, meaningful equity, full healthcare, daily team meals.

