Calico Life Sciences

Senior Data Management Engineer

Reposted 11 Days Ago
In-Office
South San Francisco, CA
173K-180K
Senior level
The Senior Data Management Engineer will manage biological datasets, ensuring quality and accessibility while collaborating with scientists on data schema and best practices.

Who We Are: 

Calico (Calico Life Sciences LLC) is an Alphabet-founded research and development company whose mission is to harness advanced technologies and model systems to increase our understanding of the biology that controls human aging. Calico will use that knowledge to devise interventions that enable people to lead longer and healthier lives. Calico’s highly innovative technology labs, its commitment to curiosity-driven discovery science, and, with academic and industry partners, its vibrant drug-development pipeline, together create an inspiring and exciting place to catalyze and enable medical breakthroughs. 

Position Description: 

As a Senior Data Management Engineer, you will work closely with Calico scientists, external collaborators, and contract research organizations to help store and provide access to large, complex, and diverse biological datasets. You will develop schemas to accurately capture and document experimental results and methods at an appropriate technical level. You will advise scientists on best practices for managing biological metadata and maintaining data provenance. You will assist with sanitizing and transforming project data and metadata. You must be able to learn and work independently, collaborate well with coworkers, and share their passion to advance Calico’s quest to understand aging and age-related disease.

Position Responsibilities: 

  • Work with scientists and engineers to identify optimal ways to prepare, annotate, store and navigate their datasets, including pairing with engineers on data application design and improvement
  • Define and document best practices for capturing and entering experimental metadata, and educate scientists and collaborators about these standards 
  • Perform data wrangling tasks including cleaning, transforming, and labeling datasets and developing relevant schemas for storing that data
  • Maintain quality control and integrity of current and archived data
  • Build data models and processes based on business and technical requirements to channel data from multiple inputs through data pipelines, ensuring successful processing and data validity

Position Requirements: 

  • 3+ years’ experience curating (organizing, cleaning, and efficiently manipulating) scientific datasets 
  • Advanced knowledge of biology (degree in life sciences or computational biology, and/or experience working in a biology lab environment) 
  • Detail-oriented with strong organizational, project management and analytical skills 
  • Ability to work effectively with scientists and engineers to elucidate and translate data organization needs into written requirements and specifications
  • Ability to understand scientific literature, experimental procedures and their limitations, and current needs of the research community
  • Knowledge of SQL; familiarity with relational databases, relational data concepts and data modeling 
  • Ability to clearly and concisely communicate technical, scientific and non-technical information, both verbally and in writing 
  • Experience writing shell scripts and/or Python – including basic data extraction, transformation, loading, and analysis scripts
  • Must be willing to work onsite at least 4 days a week

Nice to Have:

  • Familiarity with controlled vocabularies and ontologies 
  • Advanced knowledge of bioinformatics, genomics, and proteomics methods
  • Advanced knowledge of data structures and formats used in scientific approaches
  • Experience assisting clinical personnel in data and metadata submission
  • Understanding of current regulatory guidelines, GCP, and industry standards, practices, and terminologies regarding data management 
  • Ability to provide product specification and review as part of software development 
  • Experience with Unix tools for data manipulation 
  • Familiarity with software development processes in a collaborative setting, e.g. reading and reviewing teammates’ code in GitHub or similar source control
  • Experience interacting with information systems programmatically via a web API 
  • Experience with data quality assessment
  • Applied machine learning experience for curating historical/legacy lab data

The estimated base salary range for this role is $173,000 - $180,000. Actual pay will be based on a number of factors including experience and qualifications. This position is also eligible for two annual cash bonuses.

Top Skills

Python
SQL
Unix
HQ

Calico Life Sciences South San Francisco, California, USA Office

1170 Veterans Blvd, South San Francisco, California, United States
