Hilbert's AI

Lead ML Engineer / Data Scientist

Reposted 5 Days Ago
Hybrid
San Francisco, CA, USA
Senior level
Lead and own the data science function: design and build production ML systems (recommendation, forecasting, segmentation), develop configurable multi-tenant models, run rigorous experimentation and causal analyses, connect model outputs to business impact, set scientific standards, and hire and mentor a high-velocity data science team.
The summary above was generated by AI
Hilbert is building the ML systems that power demand intelligence for the world's largest consumer companies — recommendation engines, demand forecasting, customer lifecycle models, and activation systems that must work across wildly different retailers, data environments, and business contexts. This isn't single-tenant model building; it's designing configurable, production-grade ML architectures that generalize across Fortune 500 enterprises and beloved consumer brands alike.

We're looking for a Lead ML Engineer who thinks in systems, understands B2C business problems deeply, and can build the models and pipelines that power real growth outcomes — all with the ownership and urgency of a founder.

This is not a "build a model in a notebook and hand it off" role. You'll own the entire ML function — from problem framing through model development through production deployment through business impact — and you'll do it for enterprise customers where the stakes are real and the feedback loop is tight. If you understand why a recommender system matters to a retailer's P&L, can design a configurable ML system that works across customers without being rebuilt from scratch, and can explain causal impact to a room of executives with clarity and conviction, we want to meet you.

Why Hilbert AI

Hilbert is building the demand intelligence platform used by world-class B2C leaders — including the world's largest retailer — to unlock compounding growth outcomes. We sit at the intersection of AI, data, and commercial activation for retail and e-commerce.

We're scaling fast with top-tier investors behind us. ML systems are the engine behind what we deliver to customers — which means every model, every pipeline, every system you build has direct, measurable impact on enterprise revenue. We're a small, talent-dense, low-ego team. We value ownership, speed, intellectual honesty, and shipping real impact.

The Role

You'll work directly with the founding team and across engineering, product, and GTM to define, build, and scale the ML systems at the heart of Hilbert. You'll be hands-on daily — building models, designing pipelines, interrogating data, and shipping to production — but you'll also set the scientific direction, establish rigor, and grow the team. B2C is our world. The problems we solve — demand prediction, customer lifecycle, personalization, activation — require someone who understands these domains deeply and can translate business context into model design and engineering decisions. The environment is high-autonomy and high-ambiguity. Data is often messy, incomplete, or limited. You thrive in exactly those conditions.

Our Current Hurdles

These are the kinds of problems you'll walk into on day one — and you'll be the one setting the strategy for how we solve them.

  • Multi-tenant ML architectures that actually generalize — we serve enterprises with fundamentally different data shapes, catalog sizes, customer behaviors, and business constraints. The challenge is designing model architectures and pipelines that are configurable and adaptive across customers — not rebuilding bespoke systems for every account. You'll define the abstraction boundaries and decide what's shared versus customer-specific.

  • Extracting real signal from messy, limited data — enterprise data is never clean and rarely complete. Cold-start problems, sparse interaction histories, inconsistent taxonomies, missing features — this is the norm, not the exception. You'll set the modeling philosophy for how we build reliable systems when the data fights back.

  • Connecting model outputs to business actions — a recommendation score or a demand forecast is worthless if it doesn't change what an operator actually does. The hurdle is closing the loop between ML outputs and real commercial decisions — activation, merchandising, retention — in a way that's measurable and defensible. You'll own how models translate into impact, not just accuracy.

  • Causal rigor in a world that wants quick answers — enterprise customers want to know why something is happening, not just what. The challenge is building causal inference into our systems and analyses in a way that's rigorous but practical — knowing when an A/B test is sufficient, when you need difference-in-differences or synthetic controls, and when the honest answer is "we can't know yet." You'll set the bar for analytical integrity.

What you'll do:

Build — hands-on, every day

  • Design, build, and deploy ML models and pipelines that power core product capabilities: recommendation systems, search relevance, customer segmentation, demand forecasting, and activation optimization

  • Develop configurable, multi-tenant model architectures that adapt to different customer contexts, data availability, and business requirements without being rebuilt from scratch

  • Engineer production-grade ML systems — not just prototypes. You own model serving, monitoring, retraining, and the infrastructure that keeps models reliable at scale

  • Create meaningful models with the data that's actually available — not the data you wish you had. You know how to extract signal from limited, noisy, or sparse datasets

  • Design and run rigorous A/B tests and experimentation frameworks — including understanding when A/B testing is insufficient and causal inference methods are required

  • Deliver analyses that drive decisions — not dashboards that collect dust. You connect model outputs to business outcomes and communicate them with clarity

  • Apply causal reasoning rigorously — you know the difference between correlation and causation, you design analyses that surface true drivers, and you flag when others confuse the two

Lead — set direction and raise the bar

  • Define and own the ML roadmap in partnership with the founding team

  • Think in systems. You don't build isolated models — you design interconnected systems where recommendation, segmentation, scoring, and activation reinforce each other. You see how the pieces fit together and where leverage exists

  • Frame business problems as ML problems — and know when a simpler approach beats a complex model

  • Set engineering and scientific standards — validation methodology, experiment design, code quality, reproducibility, and deployment discipline

  • Prioritize across competing demands, keeping the team focused on highest-impact work

  • Communicate results, tradeoffs, and strategic recommendations clearly to founders, customers, and non-technical stakeholders

  • Be the tiebreaker on methodology and architecture — when the team debates approaches, you bring clarity

Grow — build the team and the culture

  • Hire, mentor, and develop ML engineers and data scientists as the team scales

  • Create an environment of scientific rigor without academic slowness — ship, validate, iterate

  • Build processes that work at startup speed — reviews and checkpoints that improve quality without killing velocity

  • Identify capability gaps and build the team to fill them

  • Lead by example: the team sees you in the data, in the code, in the hard problems — not just in planning docs

Who You Are

We care about how you think about problems, how you connect models to business impact, and how you make others around you sharper.

The profile:
  • You're an ML engineer who ships to production. You write clean, testable Python. You care about model serving, pipeline reliability, and monitoring — not just offline metrics. Your models don't live in notebooks; they run in production and you own them there.

  • You're a systems thinker. You don't optimize one metric in a vacuum — you understand how models, data flows, customer behavior, and business outcomes connect. You design for the system, not the silo.

  • You're a product-minded ML leader. You understand that a model with great offline metrics is useless if it doesn't move the customer's business. You frame every technical decision in terms of the outcome it enables — and you teach your team to do the same.

  • You have deep B2C business knowledge. You understand the problems that consumer businesses actually face — customer acquisition vs. retention economics, lifecycle dynamics, basket composition, churn drivers, promotional cannibalization, channel attribution, demand elasticity. You've lived in this world and it informs how you build.

  • You've built recommendation, search, and/or customer-based ML systems in production — not just in research. You understand collaborative filtering, content-based methods, ranking systems, segmentation, propensity modeling, and when each applies.

  • You build configurable systems, not one-off models. You've designed model architectures and pipelines that work across multiple customers, segments, or contexts with tunable parameters — not bespoke rebuilds for every use case.

  • You create value from limited data. You know how to make pragmatic modeling choices when data is sparse, noisy, or cold-start. You reach for the right level of complexity — not the most impressive one.

  • You're rigorous about causality. You understand causal inference methods — difference-in-differences, instrumental variables, propensity scoring, synthetic controls — and you apply them when correlation isn't enough. You design A/B tests properly and know their limitations.

  • You communicate with clarity and conviction. You can present a causal analysis to a C-suite audience and make it land. You can write a one-pager that changes a decision. Communication is not a nice-to-have here — it's the job.

  • You take ownership at the team level. You don't just own your own models — you own the team's impact. If a pipeline breaks or a model underperforms, you treat it as your problem.

  • You thrive in ambiguity. Problem definitions shift. Data availability surprises you. You bring structure to chaos without killing speed — and you coach the team to operate the same way.

  • You move at startup speed and expect the same from your team. You understand what it means to be available, responsive, and biased toward action in a fast-moving, early-stage environment.

Strong pluses:
  • Experience with ML infrastructure at scale — feature stores, model serving, orchestration, monitoring, retraining pipelines

  • Experience with experimentation platforms and A/B testing infrastructure

  • Exposure to retail, e-commerce, CPG, or marketplace data environments

  • Prior experience as an ML lead, principal ML engineer, or founding ML/data science engineer at an early-stage or high-growth company

  • Track record of hiring and developing ML engineers and data scientists — not just managing them

  • Experience with real-time and batch ML serving patterns in production

You might be:

  • A senior ML engineer at a B2C company who's tired of optimizing one metric and wants to build the whole system.

  • An ML lead at a startup who's ready for higher stakes and more ownership.

  • Someone who came up through quantitative research or applied science and moved into ML engineering because they wanted to see models run in production, not just in papers.

  • A founding ML engineer who built the function at their last company and wants to do it again where ML is the product.

What matters: you think in systems, you understand the business, you ship models that work with real data in real production environments, and you communicate impact, not just methodology.

Location

San Francisco, US

Compensation

Competitive salary + equity reflecting the seniority and scope of the role. Compensation details and structure will be shared in next steps.

The Hiring Journey

Short form → Intro call → Practical working session → Team conversations → Offer

Fast, human, no bureaucracy.

Top Skills

A/B Testing Infrastructure
Experimentation Platforms
Feature Stores
Model Serving
Monitoring
Orchestration
Python