The role involves building and managing scalable data infrastructure, optimizing multi-cluster orchestration, and ensuring data access for MLOps and research while transitioning from legacy systems to modern storage.
About Mistral
At Mistral AI, we believe in the power of AI to simplify tasks, save time, and enhance learning and creativity. Our technology is designed to integrate seamlessly into daily working life.
We democratize AI through high-performance, optimized, open-source and cutting-edge models, products and solutions. Our comprehensive AI platform is designed to meet enterprise as well as personal needs. Our offerings include Le Chat, La Plateforme, Mistral Code and Mistral Compute - a suite that brings frontier intelligence to end-users.
We are a dynamic, collaborative team passionate about AI and its potential to transform society. Our diverse workforce thrives in competitive environments and is committed to driving innovation. Our teams are distributed across France, the USA, the UK, Germany and Singapore. We are creative, low-ego and team-spirited.
Join us to be part of a pioneering company shaping the future of AI. Together, we can make a meaningful impact. See more about our culture on https://mistral.ai/careers.
Role Summary
This role focuses on building and operating the next generation of data infrastructure at Mistral AI. You will be a core contributor to our evolution, helping us design and scale massive compute fleets and storage systems built for high performance.
You will help us move toward a future of decoupled control and data planes, scaling big data compute and storage platforms while ensuring secure and governed data access for MLOps and research. You will take full lifecycle ownership: from architecting the migration away from legacy orchestrators to implementing production-grade pipelines and participating in on-call rotations for critical training jobs.
What you will do
• Build & Scale: Help us reach our goal of operating massive distributed compute and storage systems.
• Global Orchestration: Architect and maintain multi-cluster orchestration layers to optimize workload placement across diverse hardware and regions.
• Design Future-Proof Storage: Architect our transition to modern storage formats to handle fine-tuning datasets at a scale that anticipates exabyte growth.
• Platform Engineering: Contribute to the development of our internal training platform, ensuring seamless model training and fine-tuning capabilities across Kubernetes and SLURM-based environments.
• Metadata & Lineage: Implement and manage systems to provide clear visibility and lineage as our data and model pipelines grow in complexity.
• Operational Excellence: Use modern deployment workflows to manage cloud-native deployments, ensuring our data platform can scale by orders of magnitude while remaining reliable and efficient.
About you
• Have 4+ years of experience in Data Infrastructure, MLOps, or Infrastructure Engineering.
• Have experience or a strong interest in supporting foundational compute and storage platforms.
• Are proficient in Python and enjoy solving the "brittle data lake" problem with modern, columnar storage standards.
• Are well-versed in Kubernetes-native tooling and excited to debug large-scale distributed systems across multi-cluster environments.
• Take pride in building and operating scalable, reliable, and secure systems from the ground up.
• Are comfortable with ambiguity and the challenges of building high-scale infrastructure in a rapid-growth AI environment.
What we offer
- 💰 Competitive salary and equity.
- 🚑 Healthcare: Medical/Dental/Vision covered for you and your family.
- 👴🏻 Pension: 401K (6% matching)
- 🏝️ PTO: 18 days
- 🚗 Transportation: Reimbursement of office parking charges, or $120/month for public transport
- 🏀 Sport: $120/month reimbursement for gym membership
- 🥕 Meal stipend: $400 monthly allowance for meals (solution might evolve as we grow bigger)
- 🌎 Visa sponsorship
- 🤝 Coaching: we offer BetterUp coaching on a voluntary basis
By applying, you agree to our Applicant Privacy Policy.


