Data Engineer (Founding Team)

Fabrion

Reposted 20 Days Ago

In-Office or Remote

6 Locations

Senior level

Build and operate scalable data ingestion, transformation, and connector frameworks; design and maintain a knowledge-graph-based data fabric; normalize and vectorize enterprise data for LLM/AI workflows; implement governance, lineage, access controls, and secure APIs to serve ML/agent pipelines.

The summary above was generated by AI

Data/ETL Engineer (Founding Team)

Location: San Francisco Bay Area

Type: Full-Time

Compensation: Competitive salary + early-stage equity

Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

We’re building a multi-tenant, AI-native platform where enterprise data becomes actionable through semantic enrichment, intelligent agents, and governed interoperability. At the heart of this architecture lies our Data Fabric — an intelligent, governed layer that turns fragmented and siloed data into a connected ontology ready for model training, vector search, and insight-to-action workflows.

We're looking for engineers who enjoy hard data problems at scale: messy unstructured data, schema drift, multi-source joins, security models, and AI-ready semantic enrichment. You’ll build the backend systems, data pipelines, connector frameworks, and graph-based knowledge models that fuel agentic applications.

If you've worked on streaming unstructured pipelines, built connectors into ugly legacy systems, or mapped knowledge graphs that scale — this role will feel like home.

Responsibilities

Build highly reliable, scalable data ingestion and transformation pipelines across structured, semi-structured, and unstructured data sources
Develop and maintain a connector framework for ingesting from enterprise systems (ERPs, PLMs, CRMs, legacy data stores, email, Excel, docs, etc.)
Design and maintain the data fabric layer, including a knowledge graph (Neo4j or PuppyGraph) enriched with ontologies, metadata, and relationships
Normalize and vectorize data for downstream AI/LLM workflows — enabling retrieval-augmented generation (RAG), summarization, and alerting
Create and manage data contracts, access layers, lineage, and governance mechanisms
Build and expose secure APIs for downstream services, agents, and users to query enriched semantic data
Collaborate with ML/LLM teams to feed high-quality enterprise data into model training and tuning pipelines

What We’re Looking For

Core Experience:

5+ years building large-scale data infrastructure in production environments
Deep experience with ingestion frameworks (Kafka, Airbyte, Meltano, Fivetran) and data pipeline orchestration (Airflow, Dagster, Prefect)
Comfortable processing unstructured data formats: PDFs, Excel, emails, logs, CSVs, web APIs
Experience working with columnar stores, object storage, and lakehouse formats (Iceberg, Delta, Parquet)
Strong background in knowledge graphs or semantic modeling (e.g. Neo4j, RDF, Gremlin, PuppyGraph)
Familiarity with GraphQL, RESTful APIs, and designing developer-friendly data access layers
Experience implementing data governance: RBAC, ABAC, data contracts, lineage, data quality checks

Mindset & Culture Fit:

You’re a systems thinker: you want to model the real world, not just process it
Comfortable navigating ambiguous data models and building from scratch
Passionate about enabling AI systems with real-world, messy enterprise data
Pragmatic about scalability, observability, and schema evolution
Value autonomy, high trust, and meaningful ownership over infrastructure

Bonus Skills

Prior work with vector DBs (e.g. Weaviate, Qdrant, Pinecone) and embedding pipelines
Experience building or contributing to enterprise connector ecosystems
Knowledge of ontology versioning, graph diffing, or semantic schema alignment
Familiarity with data fabric patterns (e.g. Palantir Ontology, Linked Data, W3C standards)
Familiarity with fine-tuning LLMs or enabling RAG pipelines using enterprise knowledge
Experience enforcing data access policy with tools like OPA, Keycloak, or Snowflake row-level security

Why This Role Matters

Agents are only as smart as the data they operate on. This role builds the foundation — the semantic, governed, connected substrate — that makes autonomous decision-making and agent action possible. From factory ERP records to geopolitical news alerts, the data fabric unifies it all.

If you're excited to tame complexity, unify chaos, and power intelligent systems with trusted data — we’d love to hear from you.
