Rockstar

Backend Software Engineer (ML Infra)

Posted 8 Hours Ago
In-Office
San Francisco, CA, USA
Junior
Build and scale backend systems and cloud-native infrastructure for large-scale ML workloads. Implement distributed training/inference pipelines, developer tools, and observability for GPU-heavy jobs while collaborating with ML engineers.
The summary above was generated by AI

Rockstar is recruiting for a fast-growing startup that is building the AI backbone for the next generation of intelligent products. They help fast-growing AI startups design, fine-tune, evaluate, deploy, and maintain specialized models across text, vision, and embeddings. Think of them as “AWS for AI models”—not data or raw compute, but a full-stack backend for fine-tuning, reinforcement learning, inference, and long-term model maintenance. Their customers are Series A–C AI companies building enterprise-grade products. Their promise is simple: they make your AI system better.

They are hiring a Backend Software Engineer (ML Infrastructure) to help design, build, and scale the core systems that power large-scale model training and deployment.

The candidate will work on distributed training pipelines, cloud-native infrastructure, and internal developer platforms that support fine-tuning, reinforcement learning, and inference at scale. This role sits at the intersection of backend engineering and ML systems—the candidate will collaborate closely with ML engineers while owning production-grade infrastructure.

This is an ideal role for an early-career engineer who wants to work on real distributed systems, GPU workloads, and modern ML infrastructure—not dashboards or CRUD apps.

What You’ll Do

Build & Scale Core Infrastructure

- Design and implement backend systems that support large-scale ML workloads, including fine-tuning and reinforcement learning.

- Build distributed training and inference pipelines that are efficient, fault-tolerant, and observable.

- Develop internal developer tools and platforms that make it easier for ML engineers to train, evaluate, and deploy models.

Cloud & Systems Engineering

- Work on cloud-native systems using containers and orchestration (e.g., Kubernetes).

- Optimize systems for performance, reliability, and cost efficiency, especially for GPU-heavy workloads.

- Implement monitoring, logging, and observability for long-running training jobs and production services.

Collaborate with ML Engineers

- Partner closely with ML engineers to support evolving model architectures, training workflows, and evaluation needs.

- Translate ML requirements into scalable backend and infrastructure solutions.

Who You Are

Required

- 1–3 years of backend engineering experience, ideally working on production systems.

- Strong fundamentals in distributed systems, networking, and backend architecture.

- Experience building systems that scale under real load.

- Comfortable working in Python and/or Go (or similar backend languages).

- Excited to work on-site in San Francisco with a fast-moving early-stage team.

Strongly Preferred

- Experience with or exposure to ML infrastructure or ML platforms.

- Familiarity with GPU workloads, training pipelines, or inference systems.

- Experience with containerization and orchestration (Docker, Kubernetes).

- Contributions to or deep familiarity with ML infrastructure libraries such as:

  - Ray

  - vLLM

  - SGLang

  - or similar distributed ML systems

Bonus

- Computer science background from a top-tier program or equivalent demonstrated excellence.

- Open-source contributions, research projects, or side projects in systems or ML infrastructure.

- A track record of high ownership and technical curiosity.
