About Periodic Labs
We are an AI + physical sciences lab building state-of-the-art models to make novel scientific discoveries. We are well funded and growing rapidly. Team members are owners who identify and solve problems without boundaries or bureaucracy. We eagerly learn new tools and new science to push forward our mission.
About the Role
You will lead, design, build, and operate large-scale compute clusters to power AI scientific research.
You will write software that orchestrates large GPU and CPU clusters, manages resource allocation, and automates cluster lifecycle operations. You will work on the bring-up, operation, and maintenance of all aspects of these clusters.
You will build tools and get directly involved in large-scale frontier research experiments to make Periodic Labs the world's best AI + science lab for physicists, computational materials scientists, AI researchers, and engineers.
We’re looking for distributed systems engineers with experience in managing large-scale compute environments, high-performance clusters, or similar hyperscale infrastructure.
You might thrive in this role if you have experience with:
Clusters of 5,000 or more GPUs
Cluster scheduling and orchestration tools like Kubernetes and Slurm
Cloud environments such as GCP, AWS, or Azure
Observability and monitoring tools like Datadog, Prometheus, Grafana, or VictoriaMetrics
IaC tools like Terraform and Ansible
GitOps tools like GitHub CI and ArgoCD


