DataPelago

Data Processing Engineer - I/O

Reposted 5 Days Ago
In-Office or Remote
Hiring Remotely in Mountain View, CA
Senior level
The Data Processing Engineer will architect and enhance data read/write capabilities for a data processing engine, focusing on large-scale data, performance optimization, and collaboration with engineering teams.
The summary above was generated by AI

Data Processing Engineer - I/O
Mountain View, CA / Hyderabad, IN / Remote

About DataPelago:

DataPelago is at the forefront of revolutionizing data processing for traditional analytics and cutting-edge GenAI preprocessing. We are building an innovative data processing engine that is transforming how Apache Spark, Apache Flink, Ray, and others operate on diverse, large-scale data. Our team of engineers drives and adopts advances in hardware-accelerated computing, parallel processing of large-scale data, query optimization, distributed systems, compilers, machine learning, and cloud-native computing. We are looking for specialists to join our engineering team and shape the future of accelerated data processing.

The Opportunity:
As a Data Processing Engineer - I/O, you will be a key individual contributor in advancing the data read and write capabilities of DataPelago’s data processing engine. You will enhance the functional breadth, performance, scale, and reliability of the DataPelago engine in reading and writing large-scale data of various types across diverse data sources and sinks. This is a unique opportunity to make a significant impact on a category-defining product and work with a talented team of engineers.

What You'll Do:
• Architect: Influence the architecture of how our data processing engine interfaces with data sources and sinks, catalogs, and data formats.
• Design: Lead the design of functional and performance enhancements to adapters/connectors, data representations, data filtering, caching, and more in our data processing engine.
• Core Development: Individually design, implement, test, optimize, and maintain components of the data processing engine.
• Innovation and Differentiation: Analyze the technology roadmaps of existing and emerging data formats and libraries, open table formats, catalog services, and more (e.g., Apache Arrow, Apache Parquet, Apache Iceberg), and identify opportunities for our engine to enhance technology and product leadership.
• Collaboration: Partner effectively with engineering and product management in defining and realizing the data I/O roadmap of our product.
• Continuous Improvement: Foster best practices in design and code reviews, testing, CI/CD, and issue resolution to maintain the highest product quality, security, efficiency, and productivity.

What You'll Bring:

• Bachelor's degree in Computer Science or a related field with 7+ years of relevant experience OR a Master's degree in Computer Science or a related field with 5+ years of relevant experience.
• 3+ years of deep technical experience in developing and optimizing data read and write interfaces for large-scale data processing, particularly related to Apache Parquet, Apache ORC, Apache Iceberg, Apache Spark, and similar technologies.
• Demonstrated experience in instrumenting, analyzing, and optimizing the performance of data processing engine components on benchmark and customer workloads.
• Demonstrated experience in the design, development, and successful release of high-performance data processing engine features for large production deployments.
• Good knowledge of the architecture of one or more of Apache Spark, Apache Flink, and Presto/Trino.
• Exceptional programming skills in C and C++. Rust experience preferred.
• Extensive development experience in Linux environments.
• Strong analytical and problem-solving skills with a passion for performance optimization.

Location Considerations:

We value face-to-face collaboration, but recognize that talent can be found anywhere. Our engineering team works at our headquarters in Mountain View, CA, at our India office in Hyderabad, and at remote locations.

Why Join DataPelago?
• Technology Leadership: Shape the architecture and development of how our core engine works with advanced data store platforms.
• Cutting-Edge Innovation: Work on challenging problems at the forefront of accelerated computing and data processing.
• Significant Impact: Your contributions will directly impact the performance and scalability of our mission-critical platform.
• Growth: Expand your technical expertise and scope of responsibilities working with other talented engineers and with a growing product.
• Competitive compensation, stock options, comprehensive benefits package, leadership development opportunities.

HQ

DataPelago Mountain View, California, USA Office

100 View Street, Suite 102, Mountain View, CA, United States, 94041
