Hilbert's AI

Lead AI Engineer

Hybrid
San Francisco, CA, USA
Senior level
Lead the technical direction and hands-on implementation of Hilbert's AI stack: design and ship agent-based workflows, production-grade LLM systems, evaluation pipelines, and monitoring. Set architecture and engineering standards, prioritize work, and hire/mentor engineers while collaborating across product, data, and GTM.
Hilbert is building a reasoning engine that must navigate non-deterministic user behavior across data silos — turning months-long decision cycles into minutes. Fully agentic by design, our demand intelligence platform doesn't just call APIs; it solves the hard problem of orchestrating multi-step inference over messy, high-stakes enterprise data where deterministic answers don't exist.

We're looking for a Lead AI Engineer who can own the technical direction of Hilbert's AI stack, ship production-grade systems hands-on, and elevate a growing engineering team — all with the ownership and urgency of a founder.

This is not a "manage from a distance" role. You'll write code, make architecture calls, set the bar for quality and velocity, and build the engineering culture as we scale. You're the person the team looks to when the problem is ambiguous, the stakes are high, and the path forward hasn't been written yet. If you combine deep technical craft with the ability to lead people and communicate with clarity, we want to meet you.

Why Hilbert AI

Hilbert is building the demand intelligence platform used by world-class B2C leaders — including the world's largest retailer — to unlock compounding growth outcomes. We sit at the intersection of AI, data, and commercial activation for retail and e-commerce.

We're scaling fast with top-tier investors behind us. The AI stack is the product — which means engineering leadership here has direct, measurable impact on enterprise customers and revenue. We're a small, talent-dense, low-ego team. We value ownership, speed, intellectual honesty, and shipping real impact.

The Role

You'll work directly with the founding team and across product, data, and GTM to lead the design, development, and evolution of the AI systems at the heart of Hilbert. You'll be hands-on and in the code daily — but you'll also be the person who defines how we build, how we prioritize, and how we grow the engineering team. The environment is high-autonomy and high-ambiguity — the nature of building AI-native products means requirements shift, approaches evolve, and the person closest to the problem often makes the call. As Lead, you make sure the team is equipped to make those calls well.

What you'll do:

Build — hands-on, every day

  • Design, build, and maintain AI-driven features and pipelines that serve enterprise customers at scale

  • Architect and implement agent-based workflows using LangChain, LangGraph, or equivalent orchestration frameworks

  • Own critical systems end-to-end — from experimentation through production deployment and monitoring

  • Build and improve evaluation pipelines to measure, validate, and iterate on AI system performance

  • Make pragmatic engineering decisions under ambiguity — ship, learn, iterate

Lead — set direction and raise the bar

  • Define and own the technical roadmap for the AI stack in partnership with the founding team

  • Make architecture and infrastructure decisions that balance speed today with scalability tomorrow

  • Set engineering standards — code quality, review practices, testing, documentation, and deployment discipline

  • Prioritize ruthlessly across competing demands, keeping the team focused on highest-impact work

  • Communicate technical strategy, tradeoffs, and progress clearly to founders and non-technical stakeholders

  • Be the tiebreaker when the team is stuck — on architecture, approach, or prioritization

Grow — build the team and the culture

  • Hire, mentor, and develop AI engineers as the team scales

  • Create an environment of ownership, intellectual honesty, and high-velocity shipping

  • Run effective processes without bureaucracy — standups, reviews, retros that actually help

  • Identify skill gaps and build the team to fill them — whether through hiring, upskilling, or restructuring work

  • Lead by example: the team sees you in the code, in the reviews, in the hard problems — not just in meetings

Our Current Hurdles

These are the kinds of problems you'll walk into on day one — and you'll be the one setting the strategy for how we solve them.

  • Intelligent retrieval across heterogeneous approaches — our agents need the right information at exactly the right moment. The challenge isn't picking one retrieval method; it's combining RAG, graph-based retrieval, and other approaches into a unified strategy that fetches the most relevant content precisely when the agent needs it — no more, no less. As Lead, you'll define the retrieval architecture and decide when to invest in new approaches versus optimize existing ones.

  • Agentic workflows that solve real-world problems — the hard part isn't speed; it's building workflows robust enough to handle the unexpected. When an agent hits an edge case, missing data, or a situation it wasn't explicitly designed for, it needs to reason through it — leveraging available context, escalating to a human when it can't, and never silently failing. You'll own the design philosophy for how our agents degrade gracefully and recover intelligently.

  • Evaluation beyond vibes — we need systematic, reproducible evals that actually predict real-world performance. You'll build the evaluation culture — defining what "good" looks like, choosing the metrics that matter, and making sure the team ships with evidence, not intuition.

  • Execution and real-world integration — an agent that only surfaces insights isn't enough. We're building systems where agents take action — integrating with external platforms, executing workflows, and doing real work with the information they have, combined with human-in-the-loop checkpoints that keep enterprise trust intact. You'll architect the integration layer and own the reliability bar.

Who You Are

We care about how you think, how you ship, and how you make others around you better.

The profile:
  • You're a strong Python engineer first. Your code is clean, testable, and production-ready. You haven't left the codebase behind — you lead from inside it.

  • You have deep experience with LangChain, LangGraph, or equivalent agent/orchestration frameworks. You've built with them at scale, hit their limits, designed around them, and have opinions about when to use them and when not to.

  • You're a product-minded engineering leader. You understand that a 99% accurate model is useless if it doesn't solve the customer's quarterly revenue gap. You set technical direction based on business outcomes, not just technical elegance — and you teach your team to think the same way.

  • You communicate with clarity and conviction. You can align a team around a technical direction, explain a tradeoff to a non-technical founder, and give direct, constructive feedback to an engineer — all in the same day. Communication is not a nice-to-have here — it's the job.

  • You take ownership at the team level. You don't just own your own output — you own the team's output. If something falls through the cracks, you treat it as your problem.

  • You thrive in ambiguity and help others do the same. AI products evolve fast. You bring structure to chaos without killing speed — and you coach the team to operate the same way.

  • You move at startup speed and expect the same from your team. You understand what it means to be available, responsive, and biased toward action in a fast-moving, early-stage environment. You set that tempo.

Strong pluses:
  • Experience building eval pipelines — designing metrics, running systematic evaluations, and using results to drive iteration on AI systems

  • Backend software engineering experience — building APIs, services, data infrastructure, or production systems beyond the ML/AI layer

  • Exposure to retrieval-augmented generation (RAG), vector databases, or LLM-powered search and recommendation systems

  • Prior experience as a tech lead, engineering manager, or founding engineer at an early-stage or high-growth company

  • Track record of hiring and developing engineers — not just managing them

You might be:

A senior engineer or tech lead at a startup who's ready to own the entire AI function. A founding engineer who built a team around themselves and wants to do it again at a company where the AI stack is the product. An engineering manager who refuses to stop coding. Someone who's been leading agent and LLM infrastructure work at a larger company and wants full ownership and zero bureaucracy. What matters: you ship, you lead from the front, you raise the bar for everyone around you, and you communicate like a partner.

Location

San Francisco, US

Compensation

Competitive salary + equity reflecting the seniority and scope of the role. Compensation details and structure are shared in next steps.

The Hiring Journey

Short form → Intro call → Technical working session → Team conversations → Offer

Fast, human, no bureaucracy.
