Senior Data Engineer

Filevine

Posted Yesterday

Remote

Hiring Remotely in United States

160K-190K Annually

Senior level

The Senior Data Engineer will design and operate data systems, optimize data pipelines, and enhance AI workflows for legal operations. Responsibilities include data modeling, implementing governance standards, collaborating with cross-functional teams, and mentoring junior engineers.

The summary above was generated by AI

Filevine is a Legal AI company delivering Legal Operating Intelligence for the future of legal work. Grounded in a singular system of truth, Filevine brings together data, documents, workflows, and teams into one unified platform — where modern legal work happens with clarity and consistency. Powered by LOIS, the Legal Operating Intelligence System, Filevine connects context across every matter to transform legal operations from reactive to proactive. LOIS reads, understands, and reasons across your data to surface insight, automate complexity, and give professionals the clarity and confidence to see more, know more, and do more. Fueled by a team of exceptional collaborators and innovators, Filevine's rapid growth has earned AI awards and recognition from Deloitte and Inc. as one of the most innovative and fastest-growing technology companies in the country.

Role Summary

A Senior Data Engineer at Filevine is a hands-on individual contributor who designs, builds, and operates the data systems that power LOIS, our analytics products, and the agentic AI experiences our customers rely on. This role sits within the Data Engineering team and is focused on optimizing and extending Filevine's conversational self-service analytics solution — making natural-language access to legal operational data faster, smarter, and more reliable. You will partner closely with product, analytics, and AI engineering to turn raw legal and operational data into trusted, query-ready, agent-ready data products. We recognize and expect this role to spend up to 30% of time on non-coding activities including design, review, and cross-functional collaboration.

What You'll Do

- Optimize and improve Filevine's production usage of Snowflake and Cortex features — including warehouse management (usage, sizing, monitoring, etc.), clustering, query performance tuning, cost governance, and storage efficiency.
- Own and evolve our agentic data modeling and natural-language data retrieval (text-to-SQL) capabilities: build and curate semantic models, refine prompts, expand verified question libraries, and measure answer quality so that natural-language analytics get more accurate over time.
- Design and build batch and streaming data pipelines that ingest, transform, and model data from Filevine's product, CRM, billing, and telemetry systems into trusted, well-documented data products.
- Build the data foundations that power agentic AI workflows and LOIS — including feature pipelines, retrieval datasets, and low-latency serving paths for LLM-based reasoning over customer data.
- Establish reliability and governance standards including data quality checks, lineage, monitoring, incident response, access control, and PII handling consistent with our compliance posture.
- Partner with product and engineering stakeholders to define event contracts, model business concepts (matters, firms, users, billing) consistently, and reduce ambiguity across downstream consumers.
- Lead the evaluation and adoption of emerging tools across the modern data stack, recommending right-fit solutions that align with Filevine's strategic and security goals.
- Provide technical mentorship within the Data Engineering team, contribute to code reviews and design documents (DDs/ADRs), and help raise the bar on data engineering practice at Filevine.
- Participate in on-call rotations to maintain SLAs for production data pipelines and analytics surfaces.

What You'll Need

Required

- 5+ years of professional data engineering or backend engineering experience, with a proven track record of delivering production-grade data systems that drive measurable business outcomes.

- Significant hands-on experience operating a modern cloud data warehouse in production (e.g., Snowflake, BigQuery, Redshift, Databricks, Synapse, or equivalent) — including performance tuning, warehouse and cost management, role-based access control, and orchestration of warehouse-native compute (stored procedures, UDFs, streams/tasks, or equivalent).

- Demonstrated experience building agentic AI or LLM-powered systems in production, such as RAG pipelines, tool-using agents, MCP servers, warehouse-native LLM functions (for example Snowflake Cortex, BigQuery ML, or Databricks AI), or comparable frameworks.

- Expertise in advanced SQL and Python for building reliable, well-tested data pipelines and transformations.

- Experience with modern data modeling and transformation tooling such as dbt, including testing, documentation, and backward-compatible model design that supports self-service analytics.

- Experience with workflow orchestration (Airflow, Dagster, or similar) and cloud-native deployment on AWS, Azure, or GCP.

- Strong fundamentals in data modeling (dimensional, star/snowflake schemas), distributed systems, performance tuning, and data quality / observability principles.

- Professional experience with modern software development methodologies: Agile/Kanban, Git, CI/CD, and DevOps.

- Excellent written and verbal communication skills, with the ability to explain complex technical and data concepts to both technical and non-technical stakeholders.

- B.S., M.S., or Ph.D. in Computer Science, Information Systems, Engineering, or a related field, or equivalent professional experience.

Nice to Have

- Hands-on Snowflake experience, including Snowpipe, streams/tasks, data sharing, and cost/governance tuning at scale.

- Experience with Snowflake Cortex Analyst specifically, including authoring and iterating on semantic models and verified queries.

- .NET / C# experience, or familiarity with reading and integrating against a .NET-based application backend.

- Experience using modern UI development tools, particularly Svelte or React.

- Experience supporting machine learning workflows: feature stores, training datasets, or real-time scoring infrastructure.

- Experience in SaaS or product-led growth environments, including product analytics and revenue/usage telemetry.

- Experience with infrastructure as code (Terraform), containerization (Docker, Kubernetes), and deployment tooling (Octopus).

- Familiarity with the legal tech domain, document-heavy data, or working with unstructured data at scale.

- Track record of mentoring engineers and contributing to hiring and team-building.

What You Can Expect

- You will be a core builder of the data and AI foundations that LOIS and Filevine's product surfaces are built on.

- Your work will directly shape how legal professionals query, reason over, and act on their data — and will determine how fast, accurate, and trustworthy our agentic AI experiences become.

Cool Company Benefits:

- A dynamic, rapidly growing company, focused on helping organizations thrive

- Medical, Dental, & Vision Insurance (for full-time employees)

- Competitive & Fair Pay

- Maternity & paternity leave (for full-time employees)

- Short & long-term disability

- Opportunity to learn from a dedicated leadership team

- Top-of-the-line company swag

Privacy Policy Notice

Filevine will handle your personal information according to what’s outlined in our Privacy Policy.

Communication about this opportunity, or any open role at Filevine, will only come from representatives with email addresses using "filevine.com". Messages from any other address are not affiliated with Filevine; please do not respond to them.
