Data Engineer

Sorry, this job was removed at 06:16 p.m. (PST) on Monday, Jan 05, 2026

In-Office

San Francisco, CA, USA

About Prima Mente

Prima Mente is a frontier biology AI lab. We generate our own data, build general purpose biological foundation models, and translate discoveries into research and clinical outcomes. Our first goal is to tackle the brain: to deeply understand it, protect it from neurological disease, and enhance it in health. Our team of AI researchers, experimentalists, clinicians, and operators is based in London, San Francisco and Dubai.

Role focus - Biological Data Infrastructure at Petabyte Scale

Key Tasks:

Owning and scaling our data infrastructure by several orders of magnitude to handle > 100 petabyte-scale multi-omic datasets, including data pipelines, distributed data processing, and storage systems
Building a unified feature store for all our ML models and biological data analysis workflows
Efficiently storing and loading petabytes of biological data for ML
Processing and storing predictions and evaluation metrics for large-scale biological forecasting and analysis models
Implementing data versioning and point-in-time correctness systems for evolving biological datasets
Building observable, debuggable data pipelines that handle the complexity of multi-omic data sources

Expected Growth

In 1 month you will be responsible for analyzing current data infrastructure bottlenecks, implementing initial optimizations to existing pipelines, and beginning work on scaling our feature store infrastructure for ML models.

In 3 months you'll directly own and have scaled key components of our data processing systems, built prototype streaming pipelines for real-time data ingestion, and contributed to designing our unified feature store architecture.

In 6 months you'll have implemented high-performance petabyte-scale data infrastructure, established data versioning and point-in-time correctness systems, and delivered measurable improvements in data processing throughput and reliability.

Why Join Us

Meaningful Impact: Contribute directly to research infrastructure that powers discoveries potentially impacting millions of lives.
Innovation & Autonomy: Work at the forefront of AI and multi-omics, with the freedom to propose and implement state-of-the-art infrastructure solutions.
Exceptional Team: Collaborate with talented colleagues from diverse backgrounds across ML, bioinformatics, and engineering.
Growth Opportunities: Continuous learning and growth opportunities in a rapidly advancing technical field.

Who You Are

We don’t expect you to check every box. Strong applicants often have depth in some of these and interest in growing into others.

4+ years of experience building data infrastructure or data platforms with demonstrated ability to solve complex distributed systems problems independently
Experience building infrastructure for large-scale data processing pipelines (both batch and streaming) using tools like Spark, Kafka, Apache Flink, and Apache Beam, as well as proprietary solutions like Nebius
Experience designing and implementing large-scale data storage systems (feature stores, timeseries DBs) for ML use cases, with strong familiarity with relational databases, data warehouses, object storage, and expertise in DB schema design
Experience with ML infrastructure and with working at companies that use ML for core business functions
Experience building data pipelines for external data sources that are observable, debuggable, and verifiably correct, having dealt with challenges like data versioning, point-in-time correctness, and evolving schemas
Strong distributed systems and infrastructure skills - comfortable scaling and debugging Kubernetes services, writing Terraform, and working with orchestration tools like Flyte, Airflow, or Temporal
Experience with cloud platforms (AWS, GCP, Azure) and container technologies
Strong software engineering skills with ability to write easy-to-extend and well-tested code
Excellent communication skills and experience collaborating within multidisciplinary teams
Comfortable with ambiguity and a fast-moving environment, with a bias for action
Ability to pick up new skills quickly
Familiarity with bioinformatics or biological data handling
Knowledge of data governance, compliance, and security standards relevant to healthcare or biotech

Location

Based in San Francisco, US or London, UK. We support visa applications.

Culture Insight

What we are doing is extremely hard. Prima Mente is for great people. We are team players who appreciate challenges, want to be hands-on, and thrive on curiosity by throwing away assumptions. We are focused on excellence at pace and huge personal growth. We are strong communicators who are highly disciplined and rigorous.

Prima Mente operates with a flat organizational structure. We gain and share knowledge by contributing to multiple opportunities. Leadership is given to those who show initiative and consistently deliver excellence.

We arrange our lives so we can work in person as much as possible.

Our Values

Exceptional performance at exceptional pace

The solutions we build demand uncompromising quality and rigour.
The problems we are solving are grave and present.

Inquisitive discovery

We embrace curiosity and creativity.
Every question is a path to a transformational breakthrough.

Radical candour

We practice unwavering honesty and transparency in all our challenges and interactions.

Purposeful individuality

Every individual in our team is celebrated for their identity, uniqueness, and experiences.
We are invested in each one’s bespoke personal development.
Nurturing individuality will supercharge our collective purpose and spirit.

Patient impact at scale

We have a steadfast commitment to improve the health and well-being of patients globally.
Every experiment run, every dataset analysed, and every innovation developed is a step towards achieving impact at scale.

San Francisco, California, United States

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology, thanks to a thriving art and music scene, excellent food and a short drive to several of the country’s most beautiful recreational areas.

Key Facts About San Francisco Tech

Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Google, Apple, Salesforce, Meta
Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
Funding Landscape: $50.5 billion in venture capital funding in 2024 (Pitchbook)
Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
