AppZen

Principal DevOps Engineer

Posted Yesterday

In-Office

San Jose, CA, USA

240K-280K Annually

Senior level

AppZen is the leader in autonomous spend-to-pay software. Its patented artificial intelligence accurately and efficiently processes information from thousands of data sources so that organizations can better understand enterprise spend at scale to make smarter business decisions. It seamlessly integrates with existing accounts payable, expense, and card workflows to read, understand, and make real-time decisions based on your unique spend profile, leading to faster processing times and fewer instances of fraud or wasteful spend. Global enterprises, including one-third of the Fortune 500, use AppZen’s invoice, expense, and card transaction solutions to replace manual finance processes and accelerate the speed and agility of their businesses.

At AppZen, we value candidates who are actively using AI tools to enhance productivity, automate repetitive tasks, and solve problems more efficiently. Across all roles, we are looking for team members who leverage AI in meaningful ways to drive impact in their work.

To learn more, visit us at www.appzen.com.

As Principal DevOps Engineer, you are the most senior individual contributor on the team. You set the technical direction, own the hardest infrastructure and reliability problems end-to-end, and lift the entire org through architecture, code, design reviews, and mentorship. You partner closely with the DevOps Manager and engineering leadership on roadmap and standards, but your scorecard is technical outcomes, not headcount.

Expect roughly 70-80% deep hands-on engineering (Terraform, Kubernetes, Postgres, Elasticsearch, pipelines, incident command) and 20-30% technical leadership: design reviews, mentorship, cross-team alignment, and writing the standards others build against.

What You'll Do:

Set Technical Direction

Own the architecture for AppZen's cloud platform: AWS topology, Kubernetes design, datastore strategy, CI/CD, and observability. Make the long-horizon calls and write the design docs the rest of engineering builds against.

Lead deep design reviews; set bar-raising standards for reliability, security, performance, and cost across infrastructure code and production systems.

Identify the highest-leverage platform investments (toil reduction, reliability, developer velocity) and drive them from idea to rollout.

Run Reliable, Secure Cloud Infrastructure

Drive AWS architecture and operations across multiple regions and accounts; own multi-account landing-zone, IAM, and network patterns.

Set the Terraform module and IaC patterns the team uses; lead the hardest migrations and cleanups personally.

Partner with Security on SOC 2, ISO 27001, GDPR, and customer audit requirements; design controls for IAM, network, and secrets management.

Drive cloud cost engineering: visibility, forecasting, and optimization (Savings Plans, rightsizing, multi-tenant efficiency).

Operate and Scale Critical Production Datastores

Be the team's go-to expert on PostgreSQL in production: schema and index strategy, query tuning, vacuum/bloat, replication, failover, point-in-time recovery, and major-version upgrades on RDS / Aurora.

Own scaling and reliability of Elasticsearch / OpenSearch: shard and index design, JVM/heap tuning, snapshot strategy, hot-warm tiers, and incident response under heavy ingest or query load.

Set patterns for supporting datastores: Redis (caching, queues), Kafka or SQS/SNS (streaming and async), and S3-backed data lakes — including HA, durability, and disaster recovery.

Lead capacity planning, performance benchmarking, data-tier cost optimization, backup/restore drills, and customer data isolation for multi-tenant workloads.

Evolve the Kubernetes and Container Platform

Own the architecture of our EKS-based Kubernetes platform: cluster lifecycle, autoscaling, multi-tenancy, and workload isolation.

Define the golden paths service teams use — Helm, Kustomize, and GitOps tooling such as ArgoCD or Flux — and personally build the trickiest pieces.

Set patterns for service mesh, ingress, and zero-downtime deployments.

Own CI/CD and the Developer Platform

Architect internal developer platform capabilities so product teams ship safely and quickly without infra friction.

Drive the design of build, test, and deploy pipelines (e.g., GitHub Actions, Jenkins, ArgoCD); enforce supply-chain security and artifact provenance.

Set the bar for DORA metrics (lead time, deploy frequency, change failure rate, and MTTR) and own the highest-impact improvements.

Drive Observability and SRE Practice

Architect the observability stack (e.g., Datadog, Prometheus, Grafana, OpenTelemetry); define metrics, logs, and tracing standards across services.

Define and operationalize SLOs and error budgets in partnership with service owners.

Act as incident commander for high-severity events; lead blameless post-mortems and convert learnings into durable systemic fixes.

Multiply the Team

Mentor senior and staff engineers; raise the bar through code and design reviews, pairing, and writing the reference docs and runbooks others learn from.

Represent Cloud Engineering in cross-team forums; influence Product Engineering, Security, and Data on architecture and reliability decisions without authority.

Help the DevOps Manager hire: calibrate the technical bar, design interview loops, and close senior candidates.

What You Bring:

10+ years of experience in DevOps, SRE, infrastructure, or platform engineering, with at least 3 years operating at a Staff or Principal level (or equivalent technical leadership scope).

Deep, hands-on AWS expertise across compute, networking, IAM, data, and observability services; demonstrated ownership of multi-account, multi-region SaaS architectures.

Strong production experience with Kubernetes (preferably EKS), including upgrades, autoscaling, and securing multi-tenant clusters.

Demonstrated hands-on operations experience with PostgreSQL at scale — query and index tuning, replication, HA/failover, backups, and version upgrades — and with Elasticsearch / OpenSearch (cluster sizing, shard strategy, ingest tuning, and incident response).

Working knowledge of additional datastores commonly used in SaaS: Redis, Kafka or other message brokers, and object storage; comfortable evaluating trade-offs between managed services (RDS, Aurora, ElastiCache, MSK, OpenSearch Service) and self-managed options.

Expert with Terraform and modern IaC patterns; clear opinions on module design, state management, and PR-driven workflows.

Strong scripting and automation skills in at least one of Python, Go, or Bash; comfortable contributing real code, not just reviewing.

Track record of designing and operating CI/CD pipelines at scale (GitHub Actions, Jenkins, ArgoCD, or similar).

Experience running production workloads under SOC 2 or comparable compliance frameworks; comfortable partnering with Security on audits and remediation.

Demonstrated technical leadership without formal authority: writing decision-grade design docs, mentoring engineers, and influencing across teams. You enjoy lifting others through your work.

Nice To Have:

Experience supporting AI/ML or data-heavy SaaS workloads (GPU fleets, vector stores, large async pipelines).

Familiarity with service mesh (Istio, Linkerd) and progressive delivery (Argo Rollouts, feature flags).

Background scaling FinOps practices and managing cloud spend at $5M+ annual run-rate.

Experience operating multi-tenant SaaS with strict data isolation requirements for enterprise finance customers.

Exposure to multi-cloud or hybrid-cloud environments (Azure, GCP).

Open-source contributions, conference talks, or internal tech-leadership artifacts (eng wikis, RFCs, paved-road frameworks).

AppZen is committed to fair and equitable compensation practices.

The base pay range for this role is posted above. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location. The range may differ in other locations due to differences in the cost of labor.

The total compensation package for this position may also include an annual performance bonus, stock, benefits, and/or other applicable incentive compensation plans.

We are an equal opportunity employer and value diversity. All employment is decided on the basis of qualifications, merit and business need.

You can find our Privacy Notice linked at the bottom of our appzen.com website.

6201 America Center Dr, San Jose, CA, United States, 95002
