Profluent

Senior Software Engineer, Data Platform

Reposted 3 Days Ago
In-Office
Emeryville, CA, USA
170K-220K Annually
Senior level

Profluent is an AI-first protein design company. Founded in 2022, we develop deep generative models to design and validate novel, functional proteins to revolutionize biomedicine. Based in Emeryville, CA, we are backed by leading investors including Altimeter Capital, Bezos Expeditions, Spark Capital, Insight Partners, Air Street Capital, AIX Ventures, and Convergent Ventures, and have raised over $150M to date.

We’re looking for a Senior Software Engineer to help design, build, and scale Profluent’s data platform. This platform houses data from protein engineering campaigns, including protein designs, experimental results, partner datasets, analytical outputs, and model-ready training data. It enables rapid machine learning, biological discovery, and secure collaboration across internal and external programs.

This role is ideal for an engineer who enjoys building robust data systems: secure ingestion pipelines, well-structured warehouses, reliable data models, access controls, auditability, and infrastructure that makes complex scientific data usable at scale. You will work closely with ML, bioinformatics, and program teams to ensure Profluent’s data is organized, governed, accessible, and protected.

Responsibilities

  • Design, build, and maintain scalable data infrastructure for protein engineering campaigns, including ingestion, transformation, validation, storage, and retrieval of large scientific datasets
  • Develop secure data pipelines for internal and partner-generated data, with strong attention to access control, data siloing, provenance, auditability, and compliance with data use restrictions
  • Own core components of Profluent’s data warehouse and data platform, using Python, GCP, PostgreSQL, BigQuery, and related cloud-native technologies
  • Build systems that transform raw experimental, computational, and partner data into structured, reliable datasets that are ready for both analysis and model training
  • Establish best practices for data modeling, metadata management, data quality checks, schema evolution, versioning, and documentation
  • Collaborate with ML engineers, computational biologists, data scientists, and program stakeholders to understand data requirements and translate them into scalable technical systems
  • Improve engineering quality through thoughtful system design, code review, testing, CI/CD, observability, and maintainable development workflows
  • Contribute to architectural decisions for how Profluent stores, secures, organizes, and uses data across programs and partnerships

Qualifications

  • 5+ years of software engineering, data engineering, or data platform experience
  • Strong proficiency in Python and modern software development practices, including git, testing, code review, CI/CD, and production deployment
  • Experience designing and operating production data pipelines, data warehouses, and data models at scale
  • Hands-on experience with cloud platforms, preferably GCP, and technologies such as BigQuery, PostgreSQL, object storage, workflow orchestration, and containerized services
  • Strong understanding of data security, access control, data partitioning or siloing, audit logging, and managing sensitive or restricted datasets
  • Experience working with complex, heterogeneous datasets and building systems that make them reliable, discoverable, and usable
  • Ability to work independently, make sound technical decisions, and drive projects from ambiguous requirements to production systems
  • BS, MS, or PhD in Computer Science, Engineering, Data Science, Bioinformatics, or a related technical field, or equivalent practical experience

Preferred Qualifications (not required)

  • Experience with scientific, biological, clinical, genomic, laboratory, or high-throughput experimental data
  • Experience managing external partner, customer, or restricted-access datasets
  • Familiarity with data governance, lineage, metadata systems, schema registries, or data catalogs
  • Experience with research data systems, LIMS, ELNs, Benchling, or adjacent scientific platforms
  • Background working with ML, data science, computational biology, or cross-disciplinary technical teams
  • Interest in learning biology, gene editing, protein design, or machine learning concepts

What We Offer

  • High-growth opportunity with meaningful impact on the future of protein design
  • Competitive compensation package with equity participation
  • 401(k) with a strong employer match
  • Comprehensive benefits including health/dental/vision insurance
  • Generous PTO policy and commitment to work-life balance
  • Professional development opportunities in a cutting-edge field at the intersection of AI and biology

Profluent Bio, Inc. is an equal opportunity employer that promotes diversity and inclusion in the workplace. We do not discriminate on the basis of race, color, religion, marital status, age, national origin, ancestry, physical or mental disability, medical condition, veteran status, sexual orientation, gender (including gender identity and gender expression), sex (which includes pregnancy, childbirth, and breastfeeding), genetic information, taking or requesting statutorily protected leave, or any other basis protected by law.

Work Authorization Requirement

Applicants must have ongoing work authorization in the United States that does not require employer sponsorship. Sponsorship will not be provided now or at any time in the future for this position.

Employment Eligibility Verification

Legal authorization to work in the United States is required. In compliance with federal law, all persons hired must verify their identity and work eligibility and complete the required employment verification form upon hire.

Hiring Salary Range
$170,000-$220,000 USD
HQ

Profluent Berkeley, California, USA Office

Berkeley, California, United States
