Insitro

Machine Learning Intern

In-Office
South San Francisco, CA, USA
35-65 Hourly
Internship
Partner with a mentor to develop and apply ML methods on large multi-modal biological datasets (omics, imaging, genetics). Prototype, productionize, and deploy models, analyze results, and collaborate with cross-functional scientists over an 11–12 week summer internship.
The Opportunity

Global drug development productivity is declining, with an overall failure to develop effective treatments for many increasingly prevalent complex diseases affecting millions of patients per year. We seek to tackle this by combining innovative machine learning techniques with pioneering technologies that measure multiple cellular aspects, aiming to drastically improve and accelerate how drugs are discovered and developed.

We are looking for highly motivated interns to join our Compute team as machine learning scientists, working at the intersection of machine learning and life sciences, for our Summer 2026 cohort.

You will partner directly with a team mentor in developing and/or applying ML methods to process and analyze large-scale datasets from multiple modalities over the course of the summer (11-12 weeks). These internships can be based at our South San Francisco headquarters with a hybrid work schedule, or can be remote, depending on the team mentor's location and business need. Compute is a diverse team that works across the company, spanning imaging, omics, statistical genetics, pan-modality therapeutics discovery, clinical research, and research software engineering.

Examples of areas & topics you may work on:
  • Computational Biology:

    • Leverage publicly available single cell transcriptomics resources to extract insights about disease mechanisms relevant to the therapeutic areas;

  • Methods for Omics & Imaging data modalities:

    • Develop, productionize, and deploy cutting-edge ML approaches to integrate large-scale multi-modal phenotypic datasets;

  • Statistical and Translational Genetics:

    • Develop workflows to enable post-GWAS (Genome-Wide Association Study) analysis of results, e.g., fine-mapping

    • Translational genetics deep dives: enabling higher throughput annotation and exploration of candidate genes from our discovery efforts

    • Design of statistical methods to improve rare variant burden tests, and methods to improve power in longitudinal phenotypes

  • Integrative Phenotyping:

    • Develop ML models for imputing disease-relevant phenotypes from high-content clinical imaging datasets, e.g., MRI, PET-CT

    • Develop ML methods for disentangling and genetically interpreting axes of variation in complex phenotypes

    • Use LLMs to extract disease-relevant information from medical records

  • Molecular Machine Learning:

    • Explore generative models of small molecules, biologics, and/or oligonucleotide therapeutics in various data modalities such as 2D and 3D representations for hit-to-lead drug discovery efforts.

    • Develop new geometric deep learning methods to better characterize nuanced molecular properties and relationships.

  • Computational Microscopy:

    • Identify and prototype novel microscopy-driven phenotyping workflows, including hardware acquisition, post-processing, and featurization

    • Develop robust software tooling to support the deployment of new and existing methods for general use by insitro scientists

    • Optimize existing microscopy acquisition methods in both hardware and software, using ML feature outputs to benchmark improvements

What you will learn through this experience:
  • Over the course of the internship, you will learn diverse machine learning techniques, rigorously analyze complex datasets, and design metrics to ensure the robustness of our methods.

  • You can expect to develop and prototype solutions to enable ML-based decisions in our workflows.

  • You will work closely with machine learning engineers and scientists, biologists, chemists, microscopy experts, and automation engineers.

  • You will be mentored by one of our senior researchers, who has significant experience in machine learning and/or data science.

  • You will also attend our machine learning team meetings and be exposed to a diverse set of novel technologies and machine learning concepts that tackle various biological questions.

In return, we will support you by:
  • Placing a high degree of trust in your ideas and execution.

  • Bringing you up to speed in the domain of ML-enabled drug discovery.

  • Striving to provide a low-stress work environment.

  • Making ourselves available for collaboration.

  • Caring about you as a whole person - not a resource.

  • Being a well-funded startup with a stable runway.

About You
  • Working towards a BS, MS, or Ph.D. in engineering, computational biology, systems biology, computer science, mathematics, statistics, life science, chemistry, physics, or a related field.

  • Proficiency in one or more general-purpose programming languages. We primarily use Python.

  • Interest in using and developing brand new statistical and machine learning methods inspired by real problems.

  • Curiosity about human physiology or disease biology.

  • Commitment to writing high-quality, well-commented code and documentation.

  • Ability to communicate effectively and collaborate with people of diverse backgrounds and job functions.

  • Passion for making a difference in the world.

Nice to Have
  • First-hand experience with biological data, preferably using computational approaches.

  • Passion for learning how to work with diverse functional genomic assays (RNA/DNase/ATAC/ChIP-seq, etc).

  • Interest in learning how to analyze single-cell RNA-seq data.

  • Solid understanding of computational chemistry, including virtual screening (classic QSAR modeling, structure-based drug discovery), library design, etc.

  • Demonstrated ability to use and develop cutting-edge statistical and machine learning methods inspired by real problems.

  • Experience with machine learning and deep learning frameworks (e.g., scikit-learn, PyTorch, etc.).

  • Demonstrated ability to write high-quality, production-ready code (readable, well-tested, with well-designed APIs).

  • Experience with Linux environments, database languages (e.g., SQL, NoSQL), and version control practices and tools such as Git.

  • Publications of high-quality work in relevant computational biology, bioinformatics, systems biology, life sciences, or biomedical venues, including journals and conferences.

  • Passionate about solving problems, asking questions and learning independently.

  • Familiarity with the SciPy/PyData ecosystem (numpy, pandas, scipy, dask etc.).

  • Familiarity with cloud computing services (AWS or GCP).

  • Familiarity with statistical analysis software, e.g., R.

 

Compensation & Benefits at insitro

Our target starting salary for successful US-based applicants for this role is $35/hr - $65/hr. We may also adjust this range in the future based on market data.

In addition, insitro also provides our interns:

  • Excellent medical, dental, and vision coverage as well as mental health and well-being support

  • Access to free onsite baristas and daily lunch for interns who are onsite

  • Access to a free commuter bus network that provides transport to and from our South San Francisco HQ from locations all around the Bay Area

insitro is an equal opportunity employer. All applicants will be considered for employment without attention to race, color, religion, sex, sexual orientation, gender identity, national origin, veteran or disability status.

We believe diversity, equity, and inclusion need to be at the foundation of our culture. We work hard to bring together diverse teams–grounded in a wide range of expertise and life experiences–and work even harder to ensure those teams thrive in inclusive, growth-oriented environments supported by equitable company and team practices. All candidates can expect equitable treatment, respect, and fairness throughout the interview process.

Please be aware of recruitment scams: we never request payments, all recruitment communications are from @insitro.com, and if in doubt, contact us at [email protected].

About insitro

insitro is a drug discovery and development company using machine learning (ML) and data at scale to decode biology for transformative medicines. At the core of insitro’s approach is the convergence of in-house generated multi-modal cellular data and high-content phenotypic human cohort data. We rely on these data to develop ML-driven, predictive disease models that uncover underlying biologic state and elucidate critical drivers of disease. These powerful models rely on extensive biological and computational infrastructure and allow insitro to advance novel targets and patient biomarkers, design therapeutics and inform clinical strategy. insitro is advancing a wholly owned and partnered pipeline of insights and therapeutics in neuroscience and metabolism. Since launching in 2018, insitro has raised over $700 million from top tech, biotech and crossover investors, and from collaborations with pharmaceutical partners. For more information on insitro, please visit www.insitro.com.

Top Skills

Python

Insitro San Francisco, California, USA Office

259 E Grand Ave, San Francisco, CA, United States, 94080
