Hilbert's AI

AI Engineer - Enterprise

Reposted 3 Days Ago
Hybrid
San Francisco, CA, USA
Senior level
The AI Engineer will design and build production-grade AI systems, translate enterprise challenges into solutions, and collaborate with customers to enhance AI capabilities while managing the technical direction of the AI stack.
Hilbert is building a reasoning engine that must navigate non-deterministic user behavior across data silos — turning months-long decision cycles into minutes. Fully agentic by design, our demand intelligence platform doesn't just call APIs; it solves the hard problem of orchestrating multi-step inference over messy, high-stakes enterprise data where deterministic answers don't exist.

From Fortune 500 enterprises to beloved brands like FreshDirect, Blank Street, and Levain Bakery, operators run their growth on Hilbert. We're also co-building alongside leading AI companies.

We're looking for an AI Engineer who can build production-grade AI systems end-to-end and serve as the technical AI counterpart for our largest enterprise customers — understanding their workflows, translating their challenges into agentic solutions, and earning their trust through clarity, rigor, and results. All with the ownership and urgency of a startup culture.

This is not a "wire up a prompt chain and move on" role. You'll own core pieces of the AI stack that power Hilbert's demand intelligence platform — designing agent architectures, building evaluation systems, and making hard tradeoffs between accuracy, latency, and cost in production. You'll also be the person our biggest customers look to when they want to understand what the AI is doing, why it made a particular decision, and how it can be shaped to solve their specific problems. If you think in systems, have opinions about how agentic workflows should actually work, can hold your own in a room full of enterprise stakeholders, and want to build AI products that drive real outcomes, we want to meet you.

THE ROLE

You'll work directly with the founding team and across product, data, and GTM to design, build, and improve the AI systems at the heart of Hilbert — with a particular focus on our largest enterprise accounts. You'll be hands-on every day — building agents, designing workflows, shipping to production — but you'll also be the technical AI voice in customer conversations: understanding their business context firsthand, shaping how we apply our agentic systems to their problems, presenting capabilities and results, and building the trust that turns a vendor relationship into a strategic partnership.

The environment is high-autonomy and high-ambiguity — the nature of building AI-native products means requirements shift, approaches evolve, and the person closest to the problem often makes the call. In this role, you're often the person closest to both the technology and the customer.

What you'll do:

Build

  • Design, build, and maintain AI-driven features and pipelines that serve enterprise customers at scale

  • Architect and implement agent-based workflows using LangChain, LangGraph, or equivalent orchestration frameworks

  • Own systems end-to-end — from experimentation through production deployment and monitoring

  • Build and improve evaluation pipelines to measure, validate, and iterate on AI system performance

  • Make pragmatic engineering decisions under ambiguity — ship, learn, iterate

  • Shape the technical direction of the AI stack as the company scales

Partner with enterprise customers

  • Be the technical AI counterpart for our largest accounts — understanding their workflows, data environment, and business challenges firsthand, and translating them into agentic solutions

  • Present AI capabilities, results, and roadmap to senior customer stakeholders with clarity, conviction, and appropriate nuance — you're the person they trust to explain what the system does and why

  • Translate customer context into engineering decisions — what you learn in customer conversations directly informs how you design agents, workflows, and integrations. You don't build in a vacuum; you build with deep knowledge of how the output will be used

  • Hold the line on what AI can and can't do — when customers want a simpler story than reality supports, or push for capabilities that aren't ready, you find a way to be honest and helpful at the same time. You build trust through intellectual integrity, not through overpromising

  • Design customer-specific configurations and integrations — enterprise customers have unique platforms, data flows, and operational requirements. You own the technical work of making our agentic systems fit their world, combined with human-in-the-loop elements that keep enterprise trust intact

  • Feed enterprise learnings back into the product — patterns you see across customers, gaps in our systems, new workflow opportunities. Your customer exposure makes the whole team smarter

Our Current Hurdles

These are the kinds of problems you'll walk into on day one:

  • Intelligent retrieval across heterogeneous approaches — our agents need the right information at exactly the right moment. The challenge isn't picking one retrieval method; it's combining RAG, graph-based retrieval, and other approaches into a unified strategy that fetches the most relevant content precisely when the agent needs it — no more, no less. In the enterprise context, this means working with customer data environments that vary wildly in structure, quality, and accessibility.

  • Agentic workflows that solve real-world problems: the challenge is building workflows robust enough to handle the unexpected. When an agent hits an edge case, missing data, or a situation it wasn't explicitly designed for, it needs to reason through it by leveraging available context, escalating to a human when it can't, and never silently failing. You'll be the person in the room when a customer asks "what happens when it encounters X?", and the answer needs to be credible.

  • Evaluation beyond vibes — we need systematic, reproducible evals that actually predict real-world performance. If you've built custom evaluators for RAG or agent workflows, we want to talk. In enterprise accounts, you'll also need to communicate evaluation results to customers in a way that builds confidence and sets appropriate expectations.

  • Execution and real-world integration — an agent that only surfaces insights isn't enough. We're building systems where agents take action — integrating with external platforms, executing workflows, and doing real work with the information they have, combined with human-in-the-loop checkpoints that keep enterprise trust intact. Each enterprise customer has different platforms, different operational flows, and different tolerance for automation — and you'll own making it work.

WHO THRIVES IN THIS ROLE

We care about how you think, how you ship, and how you show up with customers — not how many years are on your resume.

The profile:
  • You're a strong software engineer. Your code is clean, testable, and production-ready.

  • You have real experience with LangChain, LangGraph, or equivalent agent/orchestration frameworks. You've built with them, hit their limits, and worked around them — not just followed tutorials.

  • You're a trusted technical partner to enterprise stakeholders. You've been in the room with senior audiences and presented technical work in a way that earned trust and drove decisions. You're comfortable with hard questions, pushback, and the ambiguity of enterprise conversations. You don't oversell, you don't hide behind jargon, and you know how to make AI capabilities accessible without dumbing them down.

  • You're a product-minded engineer. You understand that a technically impressive agent is useless if it doesn't solve the customer's actual problem. You care as much about the why as the how — and your customer exposure keeps you grounded in what matters.

  • You communicate with clarity and conviction. You can explain a technical decision to a non-technical founder, debate architecture tradeoffs with a senior engineer, and walk an enterprise VP through an agentic workflow — all in the same day. Communication is not a nice-to-have here — it's the job.

  • You take ownership. You don't wait for tickets. You see what needs to be built, raise your hand, and ship it. If a customer isn't getting value or an integration isn't working, you treat it as your problem.

  • You thrive in ambiguity. AI products evolve fast. Customer priorities shift. Requirements change. You're energized by figuring it out — and you bring the customer along on the journey.

  • You move at startup speed. You understand what it means to be available, responsive, and biased toward action in a fast-moving, early-stage environment.

Strong pluses:
  • Experience in customer-facing technical roles — solutions engineering, applied AI, technical account management, or consulting where you owned the technical relationship

  • Experience building eval pipelines — designing metrics, running systematic evaluations, and using results to drive iteration on AI systems

  • Backend software engineering experience — building APIs, services, data infrastructure, or production systems

  • Exposure to retrieval-augmented generation (RAG), vector databases, or LLM-powered search and recommendation systems

  • Deep exposure to retail, e-commerce, or enterprise B2C environments and the business teams that operate in them

  • Experience at early-stage startups or high-growth environments where you wore multiple hats

You might be:

  • A backend engineer who went deep on LLMs and has always been the person pulled into customer conversations because you can explain what the system actually does.

  • An AI engineer at a platform company who's tired of building for an internal team and wants to see impact face-to-face.

  • A solutions engineer or applied AI engineer who's ready to go deeper on the building side without losing the customer connection.

  • Someone at a larger company who's frustrated by the wall between "the people who build" and "the people who talk to customers" and wants to be both.

  • A startup CTO who wants to go deep on AI at a company where the stack is the product and the customer relationship is the feedback loop.

What matters: you ship, you own it, you can hold your own in a room full of enterprise stakeholders, and you communicate like a partner, not a silo.

Location

San Francisco, with occasional travel for team meets, offsites, or customer engagements.

Compensation

Competitive salary + equity package, commensurate with experience. Performance-based bonuses tied to project milestones and customer impact.

The Hiring Journey

Short form → Intro call → Technical working session → Team conversations → Offer

Fast, human, no bureaucracy.

Top Skills

LangChain
LangGraph

HQ

Hilbert's AI San Francisco, California, USA Office

San Francisco, CA, United States
