Kiddom’s Content & AI Systems team is building the data layer that powers the next generation of AI-assisted curriculum authoring and content delivery. This role sits at the intersection of data engineering and content systems — owning the pipelines, schemas, and validation frameworks that turn raw curriculum content into structured, AI-ready data products.
This is not a traditional data engineering role. Curriculum content is messy, inconsistent, and deeply domain-specific. You will work closely with Instructional Designers, AI engineers, and the Content Agents team to define data requirements, design schemas, and build the infrastructure that makes AI-powered authoring workflows possible.
You will...
- Design and own the schema and data models representing Kiddom’s curriculum content (lessons, activities, standards alignments) for downstream use
- Build ingestion pipelines that process content from varied, inconsistent source formats — XML, JSON, PDF-derived, and API-delivered
- Develop Python-based parsers, transformers, and validation scripts that enforce schema conformance and content quality at scale
- Collaborate directly with Instructional Designers and product teams to translate content authoring workflows into data engineering requirements
- Build and maintain embedding and vector database pipelines that feed Kiddom’s AI-powered content features as they scale
- Work in Git-based workflows — treating data artifacts with the same rigor as software: versioned, reviewed, and documented
What we're looking for...
- 4+ years of data engineering experience with strong Python skills — you’ve written parsers, validators, and transformation scripts for real-world messy data
- Schema design instincts — you think carefully about how data should be structured for downstream use, not just how to move it
- Data quality mindset — you build validation and completeness checks in from the start, not as an afterthought
- Cross-functional collaborator — comfortable working with non-engineers to define requirements and translate domain knowledge into data structures
- Experience provisioning and monitoring infrastructure for data systems, and familiarity with IaC tools such as Terraform and Terragrunt
- Experience operating the services the data system runs on: ECS and EKS clusters, Lambda functions, and S3 buckets
- Bachelor’s or Master’s degree in CS or equivalent work experience
Bonus:
- Background in education, curriculum design, or ed-tech — understanding how instructional content is authored and structured is a genuine differentiator
- Experience with vector databases (Pinecone, Weaviate, pgvector) or embedding pipeline tooling
- Familiarity with agentic AI patterns or Model Context Protocol (MCP)
Kiddom San Francisco, California, USA Office
We are in Union Square, close to BART, MUNI, and other public transportation, as well as coffee shops, restaurants, bars, and shops. It's a bustling neighborhood in the heart of San Francisco, convenient for offsite socializing at lunch or after work.