IntelliPro Group Inc.

Machine Learning Engineer, Training Infrastructure

Reposted Yesterday
In-Office
San Francisco, CA, USA
150K-250K Annually
Mid level
Manage and optimize computational infrastructure for training ML models, ensuring scalability for large datasets and performance optimization.
The summary above was generated by AI
Job Title: Machine Learning Engineer, Training Infrastructure
Position Type: Full time
Location: San Francisco, CA, USA
Salary Range: $150,000 - $250,000 (USD)
Job ID#: 158135
Job Description:

We are looking for an ML Engineer with 3+ years of experience in high-performance computing systems to manage and optimize our computational infrastructure for training and deploying our machine learning models. The ideal candidate has diverse experience managing ML workloads at scale and will support our 3DVAE and video diffusion models. We encourage you to apply even if you don't meet every requirement; we value curiosity, creativity, and the drive to solve hard problems.

Responsibilities:
  • Design, implement, and maintain scalable computing solutions for training and deploying ML models, ensuring infrastructure can handle large video datasets.

  • Manage and optimize the performance of our computing clusters and cloud instances (for example, on AWS or Google Cloud) to support distributed training.

  • Ensure that our infrastructure can handle the resource-intensive tasks associated with training large generative models.

  • Monitor system performance and implement improvements to maximize efficiency and utilization, using tools like Airflow for orchestration.

  • Collaborate across research teams to understand their computational needs and provide appropriate solutions, facilitating seamless model deployment.

Requirements:
  • Bachelor’s degree in Computer Science, Information Technology, or a related field, with a focus on system administration.

  • Experience with cloud computing platforms such as Amazon Web Services, Google Cloud, or Microsoft Azure, essential for managing large-scale ML workloads.

  • This role is vital for ensuring the computational backbone supports the company’s ML efforts, focusing on deployment and scalability.

  • Values sound engineering processes, including version control and CI/CD.

  • Knowledge of containerization and orchestration technologies like Docker and Kubernetes, required for deployments at scale.

  • Understanding of distributed training techniques and how to scale models across multi-node clusters to meet video generation needs.

  • Strong problem-solving and communication skills, given the need to collaborate with diverse teams.

About Us:
Founded in 2009, IntelliPro is a global leader in talent acquisition and HR solutions. Our commitment to delivering unparalleled service to clients, fostering employee growth, and building enduring partnerships sets us apart. We continue leading global talent solutions with a dynamic presence in over 160 countries, including the USA, China, Canada, Singapore, Japan, Philippines, UK, India, Netherlands, and the EU.
IntelliPro, a global leader connecting individuals with rewarding employment opportunities, is dedicated to understanding your career aspirations. As an Equal Opportunity Employer, IntelliPro values diversity and does not discriminate based on race, color, religion, sex, sexual orientation, gender identity, national origin, age, genetic information, disability, or any other legally protected group status. Moreover, our Inclusivity Commitment emphasizes embracing candidates of all abilities and ensures that our hiring and interview processes accommodate the needs of all applicants. Learn more about our commitment to diversity and inclusivity at https://intelliprogroup.com/.
Compensation: The pay offered to a successful candidate will be determined by various factors, including education, work experience, location, job responsibilities, certifications, and more. Additionally, IntelliPro provides a comprehensive benefits package, all subject to eligibility.

Top Skills

Airflow
AWS
Docker
GCP
High-Performance Computing
Kubernetes
Machine Learning
Azure
HQ

IntelliPro Group Inc. Santa Clara, California, USA Office

3120 Scott Blvd, Ste 301, Santa Clara, CA, United States, 95054
