Mithrl

Data Engineer, Knowledge Graphs

Reposted 20 Days Ago
In-Office
San Francisco, CA, USA
150K-200K Annually
Senior level
The Data Engineer will build ETL pipelines and data models for biological datasets, design graph storage systems, create APIs, and maintain data integrity while collaborating with data scientists and engineers.
The summary above was generated by AI

ABOUT MITHRL

We imagine a world where new medicines reach patients in months, not years, and where scientific breakthroughs happen at the speed of thought.

Mithrl is building the world’s first commercially available AI Co-Scientist. It is a discovery engine that transforms messy biological data into insights in minutes. Scientists ask questions in natural language, and Mithrl responds with analysis, novel targets, hypotheses, and patent-ready reports.

Our traction speaks for itself:

  • 12X year-over-year revenue growth

  • Trusted by leading biotechs and big pharma across three continents

  • Driving real breakthroughs from target discovery to patient outcomes

ABOUT THE ROLE

We are hiring a Data Engineer, Knowledge Graphs to build the infrastructure that powers Mithrl’s biological knowledge layer. You will partner closely with the Data Scientist, Knowledge Graphs to take curated knowledge sources and transform them into scalable, reliable, production-ready systems that serve the entire platform.

Your work includes building ETL pipelines for large biological datasets, designing schemas and storage models for graph-structured data, and creating the API surfaces that allow ML engineers, application teams, and the AI Co-Scientist to query and use the knowledge graph efficiently. You will also own the reliability, performance, and versioning of knowledge graph infrastructure across releases.

This role is the bridge between biological knowledge ingestion and the high-performance engineering systems that use it. If you enjoy working on data modeling, schema design, graph storage, ETL, and scalable infrastructure, this is an opportunity to have a deep impact on the intelligence layer of Mithrl.

WHAT YOU WILL DO

  • Build and maintain ETL pipelines for large public biological datasets and curated knowledge sources

  • Design, implement, and evolve schemas and storage models for graph-structured biological data

  • Create efficient APIs and query surfaces that allow internal teams and AI systems to retrieve nodes, relationships, pathways, annotations, and graph analytics

  • Partner closely with the Data Scientists to operationalize curated relationships, harmonized variable IDs, metadata standards, and ontology mappings

  • Build data models that support multi-tenant access, versioning, and reproducibility across releases

  • Implement scalable storage and indexing strategies for high-volume graph data

  • Maintain data quality, validate data integrity, and build monitoring around ingestion and usage

  • Work with ML engineers and application teams to ensure the knowledge graph infrastructure supports downstream reasoning, analysis, and discovery applications

  • Support data warehousing, documentation, and API reliability

  • Ensure performance, reliability, and uptime for knowledge graph services

WHAT YOU BRING

Required Qualifications

  • Strong experience as a data engineer or backend engineer working with data-intensive systems

  • Experience building ETL or ELT pipelines for large structured or semi-structured datasets

  • Strong understanding of database design, schema modeling, and data architecture

  • Experience with graph data models or willingness to learn graph storage concepts

  • Proficiency in Python or similar languages for data engineering

  • Experience designing and maintaining APIs for data access

  • Understanding of versioning, provenance, validation, and reproducibility in data systems

  • Experience with cloud infrastructure and modern data stack tools

  • Strong communication skills and ability to work closely with scientific and engineering teams

Nice to Have

  • Experience with graph databases or graph query languages

  • Experience with biological or chemical data sources

  • Familiarity with ontologies, controlled vocabularies, and metadata standards

  • Experience with data warehousing and analytical storage formats

  • Previous work in a tech bio company or scientific platform environment

WHAT YOU WILL LOVE AT MITHRL

  • You will build the core infrastructure that makes the biological knowledge graph fast, reliable, and usable

  • Team: Join a tight-knit, talent-dense team of engineers, scientists, and builders

  • Culture: We value consistency, clarity, and hard work. We solve hard problems through focused daily execution

  • Speed: We ship fast (2x/week) and improve continuously based on real user feedback

  • Location: Beautiful SF office with a high-energy, in-person culture

  • Benefits: Comprehensive PPO health coverage through Anthem (medical, dental, and vision) + 401(k) with top-tier plans

We encourage you to apply even if you do not believe you meet every single qualification. Not all strong candidates will meet every single qualification as listed. Research shows that people who identify as being from underrepresented groups are more prone to experiencing imposter syndrome and doubting the strength of their candidacy, so we urge you not to exclude yourself prematurely and to submit an application if you're interested in this work. We think AI systems like the ones we're building have enormous social and ethical implications. We think this makes representation even more important, and we strive to include a range of diverse perspectives on our team.

HQ

Mithrl San Francisco, California, USA Office

44 Montgomery St, San Francisco, California, United States, 94104
