Mistral AI

Research Engineer, Machine Learning

Reposted 26 Days Ago
Hybrid
Palo Alto, CA, USA
Mid level
As a Research Engineer in Machine Learning, you'll optimize large-scale ML systems, integrate research with production, and conduct experiments on deep-learning techniques, all while collaborating closely with Research Scientists.
The summary above was generated by AI
About Mistral 
 
At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.
 
We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise as well as personal needs. Our offerings include Le Chat, La Plateforme, Mistral Code and Mistral Compute - a suite that brings frontier intelligence to end-users.
 
We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed across France, the USA, the UK, Germany and Singapore. We are creative, low-ego and team-spirited.
 
Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on https://mistral.ai/careers.

Role Summary 

About the Research Engineering team

The team spans Platform (shared infra & clean code) and Embedded (inside research squads). Engineers can move along the research↔production spectrum as needs or interests evolve.

As a Research Engineer – ML track, you’ll build and optimise the large-scale learning systems that power our open-weight models. Working hand-in-hand with Research Scientists, you’ll join either:

- Platform RE Team: Enhance the shared training framework, data pipelines and cluster tooling used by every team; or
- Embedded RE Team: Sit inside a research squad (Alignment, Pre-training, Multimodal, …) and turn fresh ideas into repeatable, scalable code.


What you will do

• Accelerate researchers by taking on the heavy parts of large-scale ML pipelines and building robust tools.
• Interface cutting-edge research with production: integrate checkpoints, streamline evaluation, and expose APIs.
• Conduct experiments on the latest deep-learning techniques (sparsified 70B+ runs, distributed training on thousands of GPUs).
• Design, implement and benchmark ML algorithms; write clear, efficient code in Python.
• Deliver prototypes that become production-grade components for Le Chat and our enterprise API.

About you

• Master’s or PhD in Computer Science (or equivalent proven track record).
• 4+ years working on large-scale ML codebases.
• Hands-on with PyTorch, JAX or TensorFlow; comfortable with distributed training (DeepSpeed / FSDP / SLURM / K8s).
• Experience in deep learning, NLP or LLMs; bonus for CUDA or data-pipeline chops.
• Strong software-design instincts: testing, code review, CI/CD.
• Self-starter, low-ego, collaborative.


What we offer

  • 💰 Competitive salary and equity.
  • 🚑 Healthcare: Medical/Dental/Vision covered for you and your family.
  • 👴🏻 Pension: 401(k) (6% matching)
  • 🏝️ PTO: 18 days
  • 🚗 Transportation: Reimbursement of office parking charges, or $120/month for public transport
  • 🏀 Sport: $120/month reimbursement for gym membership
  • 🥕 Meal stipend: $400 monthly allowance for meals (solution might evolve as we grow bigger)
  • 🌎 Visa sponsorship 
  • 🤝 Coaching: we offer BetterUp coaching on a voluntary basis

By applying, you agree to our Applicant Privacy Policy.
