The role involves developing scalable ML algorithms, optimizing LLM inference performance, managing model serving pipelines, and collaborating with data engineers for efficient knowledge graph generation.
About zaimler
AI agents can't reason over data they don't understand. Enterprise data today is fragmented across dozens of systems with no shared context, meaning, or structure, and that's why most enterprise AI is failing. The shift from copilots to autonomous agents is creating an entirely new infrastructure layer, and we're building it.
zaimler is the context infrastructure for the agentic era: a platform that automatically discovers domain knowledge, maps relationships, and gives AI agents the semantic understanding to operate with precision at scale. Imagine knowledge graphs that support real-time inference, built for systems that need to reason, not just retrieve.
zaimler was founded by Biswajit Das (ex-VP Engineering, Truera), a data infrastructure veteran and former Chief Architect at Visa, and Sofus Macskassy (ex-Director of Engineering, LinkedIn), who built one of the industry's largest production knowledge graphs at LinkedIn. We're growing and deploying with major enterprises across insurance, travel, and technology. If you want to build infrastructure that the next decade of enterprise AI runs on, we'd love to talk.
You’ll join our ML team focused on turning raw enterprise data into structured, contextualized knowledge graphs and embeddings. You’ll develop novel, highly scalable algorithms for ML and data engineering that make our overall system more efficient; experiment with new approaches for distilling large models into smaller, more efficient ones; improve retrieval, ranking, and reasoning performance through feedback loops; and prototype methods that help LLMs extract and act on real-world knowledge.
We're looking for someone who thrives on iteration, cares about building with rigor, and is hungry to learn from some of the best engineers and researchers in the field.
What You’ll Be Doing
- Build and maintain training infrastructure, feature stores, and model serving pipelines
- Optimize LLM inference performance — compute efficiency, memory management, latency, and throughput
- Read, debug, and contribute to LLM runtime and supporting library code (Rust and/or C++)
- Deploy and manage models at scale using tools like vLLM and Baseten
- Architect scalable pipelines for model training and serving across GPU infrastructure
- Collaborate with ML and data engineers to ensure the platform meets research and production needs
Prior Experience
- PhD in CS, ML, or a related field, or an MS with 4+ years of relevant industry experience
- Background in LLM optimization: inference efficiency, quantization, memory layout, or serving performance
- Ability to read, navigate, and debug LLM source code and underlying runtime libraries
- Comfortable in Rust and/or C++ at the systems level; strong Python required
- Strong algorithmic fundamentals — data structures, complexity, distributed systems
- Hands-on experience with model serving infrastructure (vLLM, Baseten, Triton, or similar)
- Experience setting up and scaling ML pipelines end-to-end
Nice to Have
- Familiarity with feature store design and management
- Experience with GPU cluster management and optimization
- Contributions to open-source ML infrastructure or LLM tooling
- Experience with Ray, ONNX, TensorRT, or similar optimization and serving frameworks
- Understanding of transformer internals and attention mechanisms at the implementation level
Why Join
- A rare chance to be a founding engineer shaping both company and product direction.
- Competitive salary, benefits, and meaningful equity.
- Work alongside engineers and researchers from LinkedIn, Visa, Meta, and Branch.
- Onsite culture in San Mateo, designed for deep collaboration and high-velocity building.
- Full benefits package (Medical, Dental, Vision, 401k).
- We sponsor H-1B visas and assist with immigration processes.
We value builders over résumés. If this role excites you but you don't check every box, we still want to hear from you. zaimler is an equal opportunity employer.

