hud

Founding Engineer (Full Stack, ML DevTools & Systems)

Posted 7 Days Ago
In-Office
San Francisco, CA
150K-240K Annually
Mid level

About HUD

At HUD, we’re building the future of how companies and individuals train and evaluate AI. We believe that in the near future, most post-training data used to align and improve LLMs will flow through HUD.

We build a platform and developer tools that let teams create post-training data through RL environments and run reinforcement fine-tuning (RFT) reliably, reproducibly, and at scale.

We’re trusted by foundation labs, Fortune 500s, and fast-growing startups.

We’re also a high-caliber team: former founders, published ML researchers, Olympiad medalists, and engineers who have built products with real adoption. We run lean, move fast, and hold an extremely high bar.

The Role

We’re hiring a Founding Engineer to be a core contributor across our platform, SDK, and developer experience. This is a high-ownership role at the center of the product: you’ll build primitives, APIs, and workflows that ML engineers and researchers use daily to create data, run post-training, and evaluate systems.

You’ll operate across the stack—from Python SDK design to backend systems and infrastructure—while staying relentlessly product-minded. You’ll also work directly with customers (engineers and researchers at startups, enterprises, and frontier labs) to understand their workflows, unblock them, and ship what they actually need.

If you’ve ever cared deeply about how ML tools feel to use—ergonomics, reliability, abstractions, and “it just works” workflows—this role is for you.

Who You Are

Product owner with elite engineering taste

You don’t just build features—you build products. You can hold a cohesive vision for developer experience, make sharp tradeoffs, and ship workflows that feel obvious in hindsight.

Full stack, Python-first, systems-capable

You’re deeply fluent in Python and comfortable across the stack: backend APIs, storage, distributed systems, and front-end surfaces when needed. You understand containers and runtime environments at a real level—Docker isn’t magic to you.

ML dev tools instinct + high technical aptitude

You have unusually strong technical intuition. You spot leaky abstractions, design APIs that age well, and build tooling that makes power users faster. You care about correctness, performance, and sharp interfaces.

Comfortable with ML fundamentals (RL a big plus)

You don’t need to be an academic, but you should be able to reason about core ML concepts and math. Ideally you’ve trained models, worked with post-training workflows, or understand reinforcement learning well enough to build tools around it.

Evals mindset

You care about evaluation as a first-class product surface. Experience with Inspect or other evaluation harnesses/frameworks is a major plus, as is building eval and training pipelines for LLMs, agents, or multimodal systems.

AI-native builder

You’re proficient with modern AI coding tools and agentic workflows (Cursor, Copilot, Claude/ChatGPT-style assistants, eval-driven iteration, etc.). You move fast without sacrificing rigor.

Customer-minded engineer

You can jump into a customer’s environment, understand what’s broken, propose a solution, and deliver it—while also feeding insights back into the roadmap. You can communicate clearly with both engineers and researchers.

What You’ll Do

Build the core HUD platform
  • Design and implement backend systems for post-training workflows (RFT jobs, dataset/data flow primitives, run tracking, artifacts, permissions).

  • Build reliable execution and orchestration primitives with strong isolation and reproducibility.

Own the SDK and developer experience
  • Build and iterate on our Python SDK: clean APIs, excellent docs, great errors, sharp defaults, and extensibility.

  • Create “golden path” workflows for common user goals: creating post-training data, launching RFT runs, evaluating results, and iterating quickly.

Ship eval-native workflows
  • Help build eval pipelines for LLMs/agents that connect naturally to post-training loops (capability measurement → data creation → training → re-eval).

Go deep on systems and reliability
  • Build with Docker, Linux, and cloud infrastructure in mind; ensure consistent environments across local, CI, and production.

  • Improve performance, observability, and debuggability of job execution and data pipelines.

  • Contribute to Kubernetes deployment patterns and scaling.

Work directly with customers
  • Partner with engineers and researchers at startups, enterprises, and foundation labs.

  • Turn messy real-world feedback into product improvements: better abstractions, missing primitives, clearer docs, smoother onboarding.

What We’re Looking For (Skills & Experience)

Required

  • Strong production experience in Python.

  • Comfort across the stack (APIs, data systems, frontend integration where needed).

  • Deep understanding of Docker and Linux environments; strong debugging ability.

  • Cloud competence (K8s and AWS fundamentals—compute, networking, storage, IAM).

  • Strong product instincts and a bias toward shipping.

  • Ability to write clean, maintainable code with strong taste in interfaces and DX.

Strong Plus

  • Experience with reinforcement learning, post-training, or model training workflows.

  • Experience building or using LLM/agent eval frameworks (Inspect, EleutherAI tooling, custom harnesses).

  • Experience designing SDKs, CLIs, or developer platforms.

  • Kubernetes experience (deployment, scaling, job orchestration).

  • Active participation in the ML community (open-source contributions, writing, research engagement, etc.).

What Success Looks Like
  • You own product features end to end, working autonomously with the team to improve core customer experiences.

  • The SDK feels intuitive and inevitable: sharp, consistent, well-documented, hard to misuse.

  • The platform is dependable under real-world load: reproducible runs, fast performance, clear logs, great debugging, smooth onboarding.

  • Customers feel like HUD gives them leverage—your work directly drives adoption and retention.

Why You’ll Love It Here
  • High talent density, low ego. You’ll work with unusually strong peers who care about craft.

  • Real ownership. You’ll shape the core product and its direction at a pivotal stage.

  • Hard problems with real impact. Post-training, evals, and RL workflows are still early—your work defines best practices.

  • Move fast, build right. We care about speed and quality, and we invest in doing things correctly.

Logistics
  • Locations: San Francisco / Singapore

  • Type: Full-time

  • Visa/Relocation: Available for strong candidates (US/Singapore)

  • Compensation: $150,000-$240,000 salary, meaningful equity, full healthcare, daily team meals.

Top Skills

AWS
Docker
Kubernetes
Linux
Python

HQ

hud San Francisco, California, USA Office

San Francisco, CA, United States, 94109


