Lead the architectural evolution of a multi-cloud platform, focusing on consolidation, cloud-native migration, observability, and security compliance.
Principal Infrastructure Architect – Cloud & SaaS Platforms
San Jose, CA | Newport Beach, CA | Hybrid (2–3 days onsite)
Role Overview
We are seeking a highly experienced, hands-on Principal Infrastructure Architect to lead the architectural evolution of a large-scale, multi-cloud platform. This is a principal-level role, and the expectation goes well beyond senior: we are looking for someone who has walked into complex enterprise environments, assessed the landscape, and driven meaningful consolidation and modernization with both strategic vision and direct execution.
The core mandate is to rationalize an environment with too many overlapping systems: consolidating redundant observability platforms, streamlining security infrastructure, unifying log management, and building a data tokenization capability where none currently exists. This person will report to a VP-level architect and serve as a primary SME for infrastructure strategy.
What You'll Do
Platform Consolidation & Strategy
- Audit existing infrastructure to identify and eliminate redundant technology solutions (e.g., collapsing multiple observability platforms into one)
- Develop and drive data replication strategy and log infrastructure management across the enterprise
- Lead proof-of-concept work to evaluate and recommend replacement technologies
- Build out a net-new data tokenization solution
- Serve as SME for infrastructure strategy — planning AND execution; the doing is as important as the thinking
Cloud-Native Migration & Architecture
- Lead migration of legacy EC2-based workloads to Kubernetes on Amazon EKS; define standards for multi-region availability, auto-scaling, and Spot Instance orchestration
- Architect and deploy high-performance API Gateways and LLM Gateways to manage traffic for Generative AI workloads
- Implement a service mesh (e.g., Istio, Linkerd) for advanced traffic splitting, mTLS, and observability
Real-Time Data & Event Infrastructure
- Design infrastructure for high-scale storage engines, including AWS-native databases (DynamoDB, Aurora), OLAP systems (Druid), and real-time databases (Aerospike) supporting sub-millisecond latency
- Architect scalable Pub/Sub messaging systems (Kafka, SNS/SQS, Pulsar) to enable event-driven architectures at internet scale
Observability
- Define and implement a unified observability strategy based on OpenTelemetry and its OTLP protocol
- Integrate platforms such as Grafana, Datadog, and Graylog for single-pane-of-glass visibility across logs, metrics, and traces
Identity, Security & Compliance
- Modernize authentication and authorization systems (OIDC, OAuth2, SPIFFE/SPIRE)
- Deploy and manage centralized Secret Stores (HashiCorp Vault, AWS Secrets Manager), security gateways, and automated certificate management
- Design Zero Trust network architectures; familiarity with SOC 2 and PCI DSS is a plus
Infrastructure as Code & Technical Leadership
- Champion GitOps culture; enforce IaC best practices using Terraform, Crossplane, or Pulumi
- Mentor senior infrastructure engineers and drive Well-Architected reviews
- Partner cross-functionally with software teams to ensure infrastructure supports rapid product iteration
About You
Experience & Background
- 10+ years in infrastructure engineering and architecture, with a proven track record managing large-scale cloud deployments (AWS primary; GCP/Azure also valued)
- FAANG or Big Tech background is a strong differentiator
- Deep enterprise experience: you've walked into sprawling, complex environments, understood the strategic vision, and driven real change — not just recommendations
- Demonstrated long tenure in key roles; we are looking for people who go deep, not candidates with a string of 1–2 year stints
Technical Requirements
- Kubernetes Expert: deep hands-on mastery of Kubernetes and EKS, including custom controllers, operators, and VM-to-container migration for stateful and stateless workloads
- Database Reliability: strong experience architecting high-scale real-time stores (Aerospike, Redis) and analytics engines (Druid, ClickHouse)
- Observability: proven ability to build observability pipelines from scratch using OpenTelemetry collectors and visualization tools (Prometheus/Grafana/Datadog)
- Automation: expert-level proficiency in Go, Python, or Bash; deep CI/CD experience (GitLab CI, GitHub Actions, ArgoCD)
- Networking: deep understanding of cloud networking (VPC, Transit Gateways, Direct Connect) and protocols (gRPC, HTTP/2, WebSocket, QUIC)
Location & Work Model
Hybrid position based out of San Jose, CA or Newport Beach, CA — candidates must be within commutable distance of one of these two offices. Onsite presence of 2–3 days per week is expected. Candidates outside these locations will be considered only in exceptional cases and will face a significantly higher bar.
Compensation
Base salary is competitive and flexible, commensurate with experience. 25% target bonus. Comprehensive benefits including medical/dental/vision, 401(k), paid parental leave, and unlimited PTO.
Top Skills
Aerospike
Aurora
AWS
AWS DynamoDB
Bash
Crossplane
Datadog
Druid
EKS
GitOps
Go
Grafana
Graylog
HashiCorp Vault
HTTP/2
Istio
Kafka
Kubernetes
Linkerd
OpenTelemetry
Pulumi
Python
QUIC
SNS
SQS
Terraform
WebSocket


