Software Engineer - Cloud Engineering Lead, Kubernetes

Sorry, this job was removed at 08:11 p.m. (PST) on Tuesday, Aug 05, 2025

Hybrid

Mountain View, CA, USA

145K-250K Annually

The Cloud Infrastructure team at Kumo manages and scales our Kubernetes-based, cloud-native AI platform across multiple cloud providers. The team sets service level objectives, optimizes resource allocation, enforces security compliance, and drives cost efficiency for the Multi-Cloud Platform.

As a key team member, you will architect and operate a highly scalable, resilient Kubernetes infrastructure to support massive Big Data and AI workloads. You’ll design and implement advanced cluster management and fleet capacity scaling strategies, optimize workload scheduling, and enhance observability at scale. Your expertise in Kubernetes internals, networking, and performance tuning will be critical in ensuring high availability and seamless scaling.

Joining early, you'll play a pivotal role in shaping platform reliability, automating infrastructure, and equipping ML engineers with efficient commit-to-production automation, continuous provisioning, CI/CD, MLOps, and deployment orchestration workflows. You'll collaborate with ML scientists, product engineers, and leadership to influence scaling strategies, develop self-service tooling, and drive multi-cloud resilience. Engineers at Kumo take ownership of core system design, building infrastructure that powers the next generation of AI applications.

Key Responsibilities

Design, build, and scale Kubernetes-based infrastructure to support Kumo’s multi-cloud AI platform, ensuring high availability, resilience, and performance.
Architect and optimize large-scale Kubernetes clusters, improving scheduling, networking (CNI), and workload orchestration for production environments.
Develop and extend Kubernetes controllers and operators to automate cluster management, lifecycle operations, and scaling strategies.
Enhance observability, diagnostics, and monitoring by building tools for real-time cluster health tracking, alerting, and performance tuning.
Lead efforts to automate fleet management, optimizing node pools, autoscaling, and multi-cluster deployments across AWS, GCP, and Azure.
Define and implement Kubernetes security policies, RBAC models, and best practices to ensure compliance and platform integrity.
Collaborate with ML engineers and platform teams to optimize Kubernetes for machine learning workloads, ensuring seamless resource allocation for AI/ML models.
Drive commit-to-production automation, cloud connectivity, and deployment orchestration, ensuring seamless application rollouts, zero-downtime upgrades, and global infrastructure reliability.

Required Skills and Experience

Kubernetes Mastery: 8-10+ years of experience managing large-scale Kubernetes clusters (EKS, GKE, AKS, or open source) in production. Deep expertise in Kubernetes internals, including controllers, operators, scheduling, networking (CNI), and security policies.
Cloud-Native Infrastructure: 8-10+ years of experience building cloud-native Kubernetes-based infrastructure across AWS, Azure, and GCP.
Platform Engineering: 8-10+ years of experience building Kubernetes service meshes (Istio/Envoy, Traefik), network policies (Calico/Tigera), and distributed ingress/egress control.
Fleet Management & Scaling: Proven experience in optimizing, scaling, and maintaining Kubernetes clusters across multi-cloud environments, ensuring high availability and performance.
Software Development: 8-10+ years of experience writing production-grade controllers and operators in Python, Go, or Rust to extend Kubernetes functionality.
Infrastructure-as-Code & Automation: Hands-on experience with Terraform, CloudFormation, Ansible, and Bash and Make scripting to automate Kubernetes cluster provisioning and management.
Distributed Systems & SaaS: Expertise in building and operating large-scale distributed systems for cloud-native B2B SaaS applications running on Kubernetes.
Cloud Application Deployment: Deep expertise in building container orchestration, workload scheduling, and runtime optimizations using Kubernetes, Argo, or Flux.
Education: BS/MS in Computer Science or a related field (PhD preferred)

Nice to Have

Proficiency with cloud platforms such as AWS, GCP, or Azure.
Familiarity with chaos engineering tools and practices for testing system resilience.
Strong understanding of security best practices and compliance standards (GDPR, SOC 2, ISO 27001, vulnerability assessments, GRC, risk management).
Contributions to open-source projects, particularly in the Kubernetes or cloud-native ecosystem.
Expertise in Docker, Kubernetes, Jenkins, Flux, Argo, and Terraform in a Linux environment.
Hands-on experience with monitoring and observability tools such as Prometheus and Grafana.
Ability to develop customer-facing web frontends or public APIs/SDKs for platform services.

Benefits

Competitive salary and equity options.
Comprehensive medical and dental insurance.
An inclusive, diverse work environment where all employees are valued and supported.

We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, color, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.

357 Castro St, Suite 200, Mountain View, CA, United States, 94041

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms, and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology, thanks to a thriving art and music scene, excellent food, and a short drive to several of the country’s most beautiful recreational areas.

Key Facts About San Francisco Tech

Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Google, Apple, Salesforce, Meta
Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
Funding Landscape: $50.5 billion in venture capital funding in 2024 (PitchBook)
Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
