Augury

Software Data Engineer, Data Platform

Posted 3 Days Ago
Remote
Hiring Remotely in Haifa
Senior level
As a Software Data Engineer, you'll build production-grade data systems and pipelines, design data flows, ensure data quality, and collaborate with teams to enhance data-driven applications.
The summary above was generated by AI

Our mission is to transform how people and machines work together to push the boundaries of human productivity. A leader in Industrial AI, Augury helps the world’s manufacturers leverage real-time production insights to drive new levels of efficiency. Combining predictive and prescriptive AI technology with industry expertise, production teams can proactively address alerts, minimize downtime, reduce asset costs, and maximize yield and capacity. Our customers achieve payback in six months or less, enabling global scale. We're looking for team members excited to partner with the world's manufacturers and build the future of production together.

You are a Software Data Engineer with deep experience building data-intensive systems, not a traditional ETL or BI-focused Data Engineer. In this role, you will design and build production-grade data services, platforms, and pipelines that power DIH and our AI-driven products. You will combine strong software engineering fundamentals with modern data engineering practices, with a focus on clean architecture, reliability, scalability, observability, and testing.

As a Software Data Engineer, Data Platform, you will:

  • Build and evolve Python-based services and pipelines that ingest raw industrial events, store them reliably, and expose clean, well-modeled tables and APIs for downstream consumers, including Digital Twin, Smart Canvas, AI agents, and analytics.
  • Design systems that handle duplicates, invalid data, late-arriving events, and reprocessing in a principled, incremental, and reproducible manner.
  • Collaborate with platform, machine learning, and product teams across Israel and globally to transform complex data challenges into robust, observable, and scalable software solutions.

A Day in Your Life

Production Data Systems & Pipelines

  • Design and implement end-to-end data flows, from raw event ingestion into durable storage to modeled datasets and aggregates that power products, Digital Twin capabilities, analytics, and AI agents.
  • Build idempotent pipelines that can safely re-run without corrupting data, using deterministic keys and clearly defined contracts between raw, curated, and modeled datasets.
  • Implement incremental aggregations (e.g., machine signal summaries, production metrics, and operational KPIs) that correctly account for late-arriving data, watermarking strategies, and reproducibility requirements.
  • Model relationships and context across machines, lines, factories, sensors, work orders, and operational events to support context-aware applications, knowledge graphs, and AI agents.
  • Partner with platform teams to define how datasets are stored within our lakehouse, Digital Twin, and context graph architectures and exposed through well-defined APIs and tools.

Software Engineering & Data Quality

  • Write clean, maintainable Python services with clear separation of concerns across ingestion, validation, transformation, persistence, aggregation, and orchestration layers.
  • Apply strong data modeling and SQL fundamentals, including schema design, indexing strategies, event-time semantics, and scalable aggregation patterns.
  • Drive testing discipline across the data platform, including unit tests, data-quality tests, integration tests, and validation frameworks.
  • Design for observability through metrics, logging, tracing, and monitoring that simplify debugging, improve data quality visibility, and support production operations.
  • Troubleshoot and resolve production data issues, including incorrect aggregations, missing data, duplicate records, schema evolution challenges, and backfill operations.

Streaming, Lakehouse & Scalability

  • Build and evolve systems that scale from local development environments to cloud-scale lakehouse architectures using technologies such as Databricks, Delta Lake, and Spark.
  • Design and implement data pipelines following modern lakehouse patterns, including Bronze, Silver, and Gold layers, partitioning strategies, and cost-efficient compute utilization.
  • Work with streaming and messaging platforms (Kafka, Pub/Sub, or similar) to build reliable, idempotent consumers, replay capabilities, and reprocessing workflows.
  • Contribute to multi-tenant data architectures, data contracts, and governance practices that enable secure and efficient access to customer data at scale.

Collaboration & AI-Native Experiences

  • Work closely with DIH, Smart Canvas, and AI teams to define how agents interact with structured data, context graphs, APIs, and tools in deterministic and reliable ways.
  • Translate product requirements and user needs into technical designs that balance correctness, performance, latency, cost, and long-term maintainability.
  • Participate in architecture reviews, design discussions, code reviews, and collaborative development practices that raise the overall engineering bar across the organization.
  • Help shape the future of AI-native experiences by building the data foundations that power intelligent applications and agentic workflows.

What You Bring

  • Bachelor's degree in Computer Science, Software Engineering, Data Engineering, Information Systems, or a related engineering discipline, or equivalent practical experience.
  • 5+ years of professional software engineering experience, including substantial experience building backend systems, distributed systems, or data-intensive applications in production environments.
  • Strong Python engineering skills, including modular architecture, dependency management, testing practices, observability, and production-grade code quality.
  • Strong SQL and data modeling expertise, including schema design, indexing strategies, event-driven data models, and scalable analytical aggregations.
  • Hands-on experience building incremental and idempotent data pipelines that handle duplicate, invalid, and late-arriving events without impacting downstream consumers.
  • Experience with at least one major cloud platform (Azure, GCP, or AWS) and modern lakehouse technologies such as Databricks, Delta Lake, Spark, or equivalent architectures.
  • Experience with streaming or messaging technologies such as Kafka, Pub/Sub, Event Hubs, or similar event-driven systems.
  • Proven ability to diagnose and resolve production data issues, including data quality problems, schema evolution, backfills, replay scenarios, and performance bottlenecks.
  • Strong written and verbal communication skills in English and experience collaborating effectively with globally distributed teams.

Nice to Have

  • Experience building industrial, IoT, manufacturing, or operational data platforms.
  • Familiarity with Digital Twin architectures and industrial data models.
  • Experience with graph databases, context graphs, knowledge graphs, or relationship-centric data modeling.
  • Exposure to AI/LLM-powered applications, including retrieval-augmented generation (RAG), agents, tool calling, or evaluation frameworks.
  • Experience working with Databricks or similar lakehouse platforms from both application and platform perspectives.
  • Experience building data products that directly support AI agents, intelligent applications, or machine learning workflows.

Perks

  • Stock options
  • Paid parental leave
  • Flex PTO

Augury is a people-first organization. We believe in fostering an inclusive environment in which employees feel encouraged to share their unique perspectives, leverage their strengths, and act authentically. We know that diverse teams are strong teams, and we welcome those from all backgrounds and varying experiences. We are committed to providing employees with a work environment free of discrimination and harassment. We believe that diversity is more than just good intentions, and we are committed to creating an inclusive environment for all employees.

Augury is a proud equal opportunity employer. We strive to create a work environment in which everyone, including all applicants, employees, customers, guests, and vendors, feels safe and comfortable. We are committed to maintaining a workplace that is free of any type of harassment, and we do not tolerate anyone intimidating, humiliating, or hurting others. We prohibit willful discrimination based on age, gender, ethnicity, race, color, religion, political opinions, sexual orientation, sexual identity or expression, military or veteran status, disability, or any other characteristic protected by law.
