Fabrion

DevOps Engineer (Founding Team)

Posted Yesterday
In-Office or Remote
6 Locations
Senior level
Design, build, and operate secure, tenant-isolated cloud infrastructure and CI/CD for an AI-native multi-tenant platform. Implement observability, policy-as-code, SLAs, incident response, and automate provisioning (Terraform/Helm). Support deployment, monitoring, and secure operations for ML/agent systems at scale.
The summary above was generated by AI

Location: San Francisco Bay Area

Type: Full-Time

Compensation: Competitive salary + meaningful equity (founding tier)

Backed by 8VC, we're building a world-class team to tackle one of the industry’s most critical infrastructure problems.

About the Role

We're building an AI-native, multi-tenant enterprise platform for complex domains in industrial verticals. In this architecture, DevOps isn't just about shipping features — it's about operationalizing intelligent agents, ensuring traceability across AI systems, and supporting mission-critical ML infrastructure at scale.

We're looking for a DevOps engineer who can own infrastructure from Day 1 — automating everything from CI/CD and observability to cloud governance and security. You’ll work with a highly technical team building real-time AI pipelines and multi-agent systems. If you want to be the person who makes the platform run — fast, secure, reliable, and explainable — this is your role.

Responsibilities
  • Build and maintain scalable cloud infrastructure across AWS/GCP/Azure with a focus on secure, tenant-isolated deployments

  • Own and evolve CI/CD systems (e.g. GitHub Actions, ArgoCD) with progressive rollout, testing, and rollback flows

  • Establish observability tooling across services, agents, and pipelines (OpenTelemetry, Prometheus, Grafana, Sentry)

  • Implement policy-as-code (OPA, Rego) for deployment safety, RBAC, audit logging, and approval workflows

  • Define and enforce SLAs, uptime targets (99.99%+), incident response, and remediation workflows

  • Secure infrastructure: IAM, VPC, encryption, key management, image scanning, secrets rotation

  • Automate deployments, infrastructure provisioning (Terraform, Helm), and environment replication

What We’re Looking For

Core Experience:

  • 4–10+ years in DevOps, platform engineering, or SRE in production-grade systems

  • Strong experience with Docker, Kubernetes (EKS/GKE), Terraform or Pulumi

  • Hands-on experience deploying and monitoring distributed cloud-native systems

  • Familiarity with GitOps practices, CI/CD design, progressive delivery, and secure SDLC

  • Clear understanding of how to implement monitoring, alerting, and failure simulation in dynamic environments

Engineering Mindset:

  • Obsessed with reliability, latency, uptime, and repeatability

  • Security-aware and compliance-conscious

  • Proactive — you don’t wait for alerts to fix things

  • Comfortable collaborating with backend, AI, and data teams

Bonus: Agent-Native / ML Ops Capabilities

We’re building an agentic, AI-native platform from the ground up. Experience here isn’t required, but would be a strong differentiator:

  • Experience running LLM orchestration frameworks (e.g. LangChain, LangGraph, Dust, ReAct agents)

  • Building retrieval-augmented generation (RAG) pipelines — and deploying them safely and repeatably

  • Familiarity with vector DBs (Weaviate, Qdrant, Pinecone) and embedding pipelines

  • Monitoring and governing long-running or multi-agent chains

  • Auditability and replay systems for agent decision-making

  • Serving fine-tuned or open-source LLMs with model versioning and GPU scaling (e.g. vLLM, TGI)

  • Interest in auto-remediation using agents (e.g. observability + alert → insight → response via LLM)

Why This Role Matters

DevOps is the nervous system of the platform — every agent, every data fabric component, every pipeline flows through what you build. This is a rare opportunity to design that system early, the right way, and future-proof it for scale, compliance, and trust.

If you're excited by intelligent systems, distributed data, and deeply technical infrastructure problems — and you want your work to have immediate real-world impact — we’d love to hear from you.

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology, thanks to a thriving art and music scene, excellent food and a short drive to several of the country’s most beautiful recreational areas.

Key Facts About San Francisco Tech

  • Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
  • Major Tech Employers: Google, Apple, Salesforce, Meta
  • Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
  • Funding Landscape: $50.5 billion in venture capital funding in 2024 (PitchBook)
  • Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
  • Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
