AppZen

Principal DevOps Engineer

Posted Yesterday
In-Office
San Jose, CA, USA
240K-280K Annually
Senior level
As Principal DevOps Engineer, you will lead technical direction, manage cloud infrastructure, optimize data stores, and mentor engineers while driving the adoption of best practices in engineering and security.
AppZen is the leader in autonomous spend-to-pay software. Its patented artificial intelligence accurately and efficiently processes information from thousands of data sources so that organizations can better understand enterprise spend at scale to make smarter business decisions. It seamlessly integrates with existing accounts payable, expense, and card workflows to read, understand, and make real-time decisions based on your unique spend profile, leading to faster processing times and fewer instances of fraud or wasteful spend. Global enterprises, including one-third of the Fortune 500, use AppZen’s invoice, expense, and card transaction solutions to replace manual finance processes and accelerate the speed and agility of their businesses.
 
At AppZen, we value candidates who are actively using AI tools to enhance productivity, automate repetitive tasks, and solve problems more efficiently. Across all roles, we are looking for team members who leverage AI in meaningful ways to drive impact in their work.
 
To learn more, visit us at www.appzen.com.

As Principal DevOps Engineer, you are the most senior individual contributor on the team. You set the technical direction, own the hardest infrastructure and reliability problems end-to-end, and lift the entire org through architecture, code, design reviews, and mentorship. You partner closely with the DevOps Manager and engineering leadership on roadmap and standards, but your scorecard is technical outcomes, not headcount.

Expect roughly 70-80% deep hands-on engineering (Terraform, Kubernetes, Postgres, Elasticsearch, pipelines, incident command) and 20-30% technical leadership: design reviews, mentorship, cross-team alignment, and writing the standards others build against.

What You'll Do:

    Set Technical Direction

  • Own the architecture for AppZen's cloud platform: AWS topology, Kubernetes design, datastore strategy, CI/CD, and observability — make the long-horizon calls and write the design docs the rest of engineering builds against.

  • Lead deep design reviews; set bar-raising standards for reliability, security, performance, and cost across infrastructure code and production systems.

  • Identify the highest-leverage platform investments (toil reduction, reliability, developer velocity) and drive them from idea to rollout.

    Run Reliable, Secure Cloud Infrastructure

  • Drive AWS architecture and operations across multiple regions and accounts; own multi-account landing-zone, IAM, and network patterns.

  • Set the Terraform module and IaC patterns the team uses; lead the hardest migrations and cleanups personally.

  • Partner with Security on SOC 2, ISO 27001, GDPR, and customer audit requirements; design controls for IAM, network, and secrets management.

  • Drive cloud cost engineering: visibility, forecasting, and optimization (Savings Plans, rightsizing, multi-tenant efficiency).

    Operate and Scale Critical Production Datastores

  • Be the team's go-to expert on PostgreSQL in production: schema and index strategy, query tuning, vacuum/bloat, replication, failover, point-in-time recovery, and major-version upgrades on RDS / Aurora.

  • Own scaling and reliability of Elasticsearch / OpenSearch: shard and index design, JVM/heap tuning, snapshot strategy, hot-warm tiers, and incident response under heavy ingest or query load.

  • Set patterns for supporting datastores: Redis (caching, queues), Kafka or SQS/SNS (streaming and async), and S3-backed data lakes — including HA, durability, and disaster recovery.

  • Lead capacity planning, performance benchmarking, data-tier cost optimization, backup/restore drills, and customer data isolation for multi-tenant workloads.

    Evolve the Kubernetes and Container Platform

  • Own the architecture of our EKS-based Kubernetes platform: cluster lifecycle, autoscaling, multi-tenancy, and workload isolation.

  • Define the golden paths service teams use — Helm, Kustomize, and GitOps tooling such as ArgoCD or Flux — and personally build the trickiest pieces.

  • Set patterns for service mesh, ingress, and zero-downtime deployments.

    Own CI/CD and the Developer Platform

  • Architect internal developer platform capabilities so product teams ship safely and quickly without infra friction.

  • Drive the design of build, test, and deploy pipelines (e.g., GitHub Actions, Jenkins, ArgoCD); enforce supply-chain security and artifact provenance.

  • Set the bar for DORA metrics: lead time, deploy frequency, change failure rate, and MTTR — and own the highest-impact improvements.

    Drive Observability and SRE Practice

  • Architect the observability stack (e.g., Datadog, Prometheus, Grafana, OpenTelemetry); define metrics, logs, and tracing standards across services.

  • Define and operationalize SLOs and error budgets in partnership with service owners.

  • Act as incident commander for high-severity events; lead blameless post-mortems and convert learnings into durable systemic fixes.

    Multiply the Team

  • Mentor senior and staff engineers; raise the bar through code and design reviews, pairing, and writing the reference docs and runbooks others learn from.

  • Represent Cloud Engineering in cross-team forums; influence Product Engineering, Security, and Data on architecture and reliability decisions without authority.

  • Help the DevOps Manager hire: calibrate the technical bar, design interview loops, and close senior candidates.

What You Bring:

  • 10+ years of experience in DevOps, SRE, infrastructure, or platform engineering, with at least 3 years operating at a Staff or Principal level (or equivalent technical leadership scope).

  • Deep, hands-on AWS expertise across compute, networking, IAM, data, and observability services; demonstrated ownership of multi-account, multi-region SaaS architectures.

  • Strong production experience with Kubernetes (preferably EKS), including upgrades, autoscaling, and securing multi-tenant clusters.

  • Demonstrated hands-on operations experience with PostgreSQL at scale — query and index tuning, replication, HA/failover, backups, and version upgrades — and with Elasticsearch / OpenSearch (cluster sizing, shard strategy, ingest tuning, and incident response).

  • Working knowledge of additional datastores commonly used in SaaS: Redis, Kafka or other message brokers, and object storage; comfortable evaluating trade-offs between managed services (RDS, Aurora, ElastiCache, MSK, OpenSearch Service) and self-managed options.

  • Expert with Terraform and modern IaC patterns; clear opinions on module design, state management, and PR-driven workflows.

  • Strong scripting and automation skills in at least one of Python, Go, or Bash; comfortable contributing real code, not just reviewing.

  • Track record of designing and operating CI/CD pipelines at scale (GitHub Actions, Jenkins, ArgoCD, or similar).

  • Experience running production workloads under SOC 2 or comparable compliance frameworks; comfortable partnering with Security on audits and remediation.

  • Demonstrated technical leadership without formal authority: writing decision-grade design docs, mentoring engineers, and influencing across teams. You enjoy lifting others through your work.

Nice To Have:


  • Experience supporting AI/ML or data-heavy SaaS workloads (GPU fleets, vector stores, large async pipelines).

  • Familiarity with service mesh (Istio, Linkerd) and progressive delivery (Argo Rollouts, feature flags).

  • Background scaling FinOps practices and managing cloud spend at $5M+ annual run-rate.

  • Experience operating multi-tenant SaaS with strict data isolation requirements for enterprise finance customers.

  • Exposure to multi-cloud or hybrid-cloud environments (Azure, GCP).

  • Open-source contributions, conference talks, or internal tech-leadership artifacts (eng wikis, RFCs, paved-road frameworks).

AppZen is committed to fair and equitable compensation practices.

The base pay range for this role is posted above. Actual compensation packages are based on several factors that are unique to each candidate, including but not limited to skill set, depth of experience, certifications, and specific work location. This may be different in other locations due to differences in the cost of labor.

The total compensation package for this position may also include an annual performance bonus, stock, benefits, and/or other applicable incentive compensation plans.


We are an equal opportunity employer and value diversity. All employment is decided on the basis of qualifications, merit and business need.
 
You can find our Privacy Notice linked at the bottom of our website, appzen.com.

HQ

AppZen San Jose, California, USA Office

6201 America Center Dr, San Jose, CA, United States, 95002
