Gemini

Staff Data Engineer

Posted 11 Days Ago
In-Office
2 Locations
168K-240K Annually
Senior level
Lead design and implementation of scalable batch and streaming data infrastructure. Build and maintain ETL/ELT pipelines, ensure data quality and observability, troubleshoot production issues, mentor engineers, and partner with cross-functional teams to deliver reliable, self-serve data products.
The summary above was generated by AI

About the Company

Gemini is a global crypto and Web3 platform founded by Cameron and Tyler Winklevoss in 2014, offering a wide range of simple, reliable, and secure crypto products and services to individuals and institutions in over 70 countries. Our mission is to unlock the next era of financial, creative, and personal freedom by providing trusted access to the decentralized future. We envision a world where crypto reshapes the global financial system, internet, and money to create greater choice, independence, and opportunity for all — bridging traditional finance with the emerging cryptoeconomy in a way that is more open, fair, and secure. As a publicly traded company, Gemini is poised to accelerate this vision with greater scale, reach, and impact.

The Department: Data

At Gemini, our Data Team is the engine that powers insight, innovation, and trust across the company. We bring together world-class data engineers, platform engineers, machine learning engineers, analytics engineers, and data scientists — all working in harmony to transform raw information into secure, reliable, and actionable intelligence. From building scalable pipelines and platforms, to enabling cutting-edge machine learning, to ensuring governance and cost efficiency, we deliver the foundation for smarter decisions and breakthrough products. We thrive at the intersection of crypto, technology, and finance, and we’re united by a shared mission: to unlock the full potential of Gemini’s data to drive growth, efficiency, and customer impact.

The Role: Staff Data Engineer

The Data team is responsible for designing and operating the data infrastructure that powers insight, reporting, analytics, and machine learning across the business. As a Staff Data Engineer, you will lead architectural initiatives, mentor others, and build high-scale systems that impact the entire organization. You will partner closely with product, analytics, ML, finance, operations, and engineering teams to move, transform, and model data reliably, with observability, resilience, and agility.

This role requires working in person twice a week at either our San Francisco, CA or New York City, NY office.

Responsibilities:

  • Lead the architecture, design, and implementation of data infrastructure and pipelines, spanning both batch and real-time / streaming workloads
  • Build and maintain scalable, efficient, and reliable ETL/ELT pipelines using languages and frameworks such as Python, SQL, Spark, Flink, Beam, or equivalents
  • Work on real-time or near-real-time data solutions (e.g. CDC, streaming, micro-batch) for use cases that require timely data
  • Partner with data scientists, ML engineers, analysts, and product teams to understand data requirements, define SLAs, and deliver coherent data products that others can self-serve
  • Establish data quality, validation, observability, and monitoring frameworks (data auditing, alerting, anomaly detection, data lineage)
  • Investigate and resolve complex production issues: root cause analysis, performance bottlenecks, data integrity, fault tolerance
  • Mentor and guide more junior and mid-level data engineers: lead code reviews, design reviews, and best-practice evangelism
  • Stay up to date on new tools, technologies, and patterns in the data and cloud space, bringing proposals and proof-of-concepts when appropriate
  • Document data flows, data dictionaries, architecture patterns, and operational runbooks

Minimum Qualifications:

  • 8+ years of experience in data engineering (or similar) roles
  • Strong experience in ETL/ELT pipeline design, implementation, and optimization
  • Deep expertise in Python and SQL, including writing production-quality, maintainable, testable code
  • Experience with large-scale data warehouses (e.g. Databricks, BigQuery, Snowflake)
  • Solid grounding in software engineering fundamentals, data structures, and systems thinking
  • Hands-on experience in data modeling (dimensional modeling, normalization, schema design)
  • Experience building systems with real-time or streaming data (e.g. Kafka, Kinesis, Flink, Spark Streaming), and familiarity with CDC frameworks
  • Experience with orchestration / workflow frameworks (e.g. Airflow)
  • Familiarity with data governance, lineage, metadata, cataloging, and data quality practices
  • Strong cross-functional communication skills; ability to translate between technical and non-technical stakeholders
  • Proven experience in mentoring, leading design discussions, and influencing data-engineering best practices across teams

Preferred Qualifications:

  • Experience with crypto, financial services, trading, markets, or exchange systems
  • Experience with blockchain, crypto, and Web3 data (e.g. blocks, transactions, contract calls, token transfers, UTXO/account models, on-chain indexing, chain APIs)
  • Experience with infrastructure as code, containerization, and CI/CD pipelines
  • Hands-on experience managing and optimizing Databricks on AWS

It Pays to Work Here
 
The compensation & benefits package for this role includes:
  • Competitive starting pay
  • A discretionary annual bonus
  • Long-term incentive in the form of a new hire equity grant
  • Comprehensive health plans
  • 401(k) with company matching
  • Paid Parental Leave
  • Flexible time off

Salary Range: The base salary range for this role is between $168,000 and $240,000 in the State of New York, the State of California, and the State of Washington. This range is not inclusive of our discretionary bonus or equity package. When determining a candidate’s compensation, we consider a number of factors including skillset, experience, job scope, and current market data.

In the United States, we offer a hybrid work approach at our hub offices, balancing the benefits of in-person collaboration with the flexibility of remote work. Expectations may vary by location and role, so candidates are encouraged to connect with their recruiter to learn more about the specific policy for the role. Employees who do not live near one of our hubs are part of our remote workforce.

At Gemini, we strive to build diverse teams that reflect the people we want to empower through our products, and we are committed to equal employment opportunity regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, or Veteran status. Equal Opportunity is the Law, and Gemini is proud to be an equal opportunity workplace. If you have a specific need that requires accommodation, please let a member of the People Team know.

Top Skills

Python, SQL, Databricks, BigQuery, Snowflake, Kafka, Kinesis, Flink, Spark Streaming, CDC frameworks, Airflow
