As an Infrastructure Engineer, you will build and maintain AI platforms, enhance production reliability, and collaborate on architecture decisions.
Maxana is seeking an experienced Infrastructure Engineer for a confidential client — a fast-growing AI company. In this role you will build and maintain the platform layer supporting large-scale ML training, inference, and deployment. This is a high-impact role at the intersection of cloud infrastructure and ML systems.
Key Responsibilities
- Build and maintain infrastructure supporting large-scale ML training and inference workloads
- Work with GPU and compute infrastructure, distributed systems, and cloud-native platforms
- Improve reliability, observability, and performance across the platform layer
- Collaborate directly with senior engineers and product teams on architecture decisions
- Own production reliability — monitoring, incident response, and proactive risk reduction
- Develop and maintain internal tooling and automation to support engineering operations
Requirements
- 5+ years of infrastructure or platform engineering experience in a production environment
- Strong distributed systems background; experience with large-scale compute workloads preferred
- Cloud-native infrastructure experience on AWS, GCP, or Azure; Docker and Kubernetes required
- Familiarity with ML infrastructure (training pipelines, inference serving, GPU workloads) is a strong plus
- Experience owning production reliability end to end
Benefits
- Competitive base salary ($130,000-$240,000) + equity
- Medical, dental, and vision
- Flexible paid time off
- Learning and development stipend
- Working at the forefront of AI infrastructure at scale


