Phizenix

ML Infrastructure Engineer

Reposted 14 Days Ago
In-Office
Menlo Park, CA
180K-200K Annually
Senior level
The ML Infrastructure Engineer will design distributed systems for ML training, optimize inference, build automation pipelines, and monitor production performance.
The summary above was generated by AI

ML Infrastructure Engineer
Menlo Park, CA | On-Site | Full-Time/Direct Hire


Looking for ML Infra experts (Bay Area preferred) with deep experience in CUDA, GPU optimization, vLLM, and LLM inference. The focus is purely on language, with no vision/audio.

Client Opportunity | Through Phizenix

Phizenix, a certified minority and women-led recruiting firm, is hiring on behalf of an AI startup pioneering diffusion-based large language models—built for faster generation, multimodal integration, and scalable enterprise deployment.

We’re looking for an ML Infrastructure Engineer to help build the infrastructure that powers large-scale model training and real-time inference. You’ll collaborate with world-class researchers and engineers to design high-performance, distributed systems that bring advanced LLMs into production.

Responsibilities
  • Design and manage distributed infrastructure for ML training at scale

  • Optimize model serving systems for low-latency inference

  • Build automated pipelines for data processing, model training, and deployment

  • Implement observability tools to monitor performance in production

  • Maximize resource utilization across GPU clusters and cloud environments

  • Translate research requirements into robust, scalable system designs

Must-Haves
  • Master's or PhD in Computer Science, Engineering, or a related field (or equivalent experience)

  • Strong foundation in software engineering, systems design, and distributed systems

  • Experience with cloud platforms (AWS, GCP, or Azure)

  • Proficiency in Python and at least one systems-level language (C++/Rust/Go)

  • Hands-on experience with Docker, Kubernetes, and CI/CD workflows

  • Familiarity with ML frameworks like PyTorch or TensorFlow from a systems perspective

  • Understanding of GPU programming and high-performance infrastructure

Nice-to-Haves
  • Experience with large-scale ML training clusters and GPU orchestration

  • Knowledge of LLM-serving tools (vLLM, TensorRT, ONNX Runtime)

  • Experience with distributed training strategies (e.g., data/model/pipeline parallelism)

  • Familiarity with orchestration tools like Kubeflow or Airflow

  • Background in performance tuning, system profiling, and MLOps best practices

At Phizenix, we’re committed to supporting diverse and inclusive teams. This is your chance to shape the systems that power the next generation of AI innovation. Let’s build the future—together.

California Pay Range
$180,000-$200,000 USD

Top Skills

Airflow
AWS
Azure
C++
CI/CD
CUDA
Docker
GCP
Go
GPU Optimization
Kubeflow
Kubernetes
LLM Inference
ONNX Runtime
Python
PyTorch
Rust
TensorFlow
TensorRT
vLLM
HQ

Phizenix Livermore, California, USA Office

101 E. Vineyard Ave, Suite #119–115, Livermore, CA, United States, 94550
