Genmo

GPU Performance Engineer

Reposted 2 Days Ago
In-Office
San Francisco, CA, USA
Senior level
Optimize GPU performance, debug issues, write custom kernels, and collaborate with ML engineers to enhance model serving efficiency.
The summary above was generated by AI

We are Genmo, a research lab dedicated to building open, state-of-the-art models for video generation towards unlocking the right brain of AGI. Join us in shaping the future of AI and pushing the boundaries of what's possible in video generation.

We're seeking a GPU Performance Engineer to squeeze every last FLOP from our H100 infrastructure and optimize our model serving stack to its absolute limits.
The Role
You'll be our performance optimization expert, using advanced profiling tools to identify bottlenecks and implementing solutions that achieve 5-10x speedups. From writing custom CUDA kernels to eliminating cold start latency, you'll ensure our infrastructure delivers world-class performance. This role is perfect for someone who gets excited about microsecond optimizations and pushing hardware to its theoretical limits.
Key Responsibilities

  • Profile and optimize GPU workloads using Nsight Systems, nvprof, and custom instrumentation

  • Write high-performance CUDA and Triton kernels for critical model operations

  • Reduce cold start latency from seconds to milliseconds for our serving infrastructure

  • Tune memory access patterns, kernel fusion, and GPU utilization

  • Collaborate with ML engineers to optimize model implementations

  • Debug performance issues across the full stack from application to hardware

  • Implement custom memory pooling and allocation strategies

  • Share optimization techniques and build performance culture across teams

Qualifications

  • Bachelor's or Master's degree in Computer Science, Electrical Engineering, or related field

  • 5+ years of systems programming experience, including 3+ years focused on GPU optimization

  • Expert proficiency with GPU profiling tools (Nsight Systems, nvprof)

  • Strong CUDA programming skills with production kernel development

  • Deep understanding of GPU architecture (memory hierarchy, SMs, warps)

  • Track record of achieving significant performance improvements (5-10x)

  • Experience with Python and C++ in production environments

We Value

  • Experience with Triton kernel development

  • Knowledge of CUTLASS or similar high-performance libraries

  • Background in ML-specific optimizations (attention, transformers)

  • RDMA/InfiniBand optimization experience

  • Contributions to GPU libraries or frameworks

  • Low-level debugging skills (PTX/SASS reading)

Genmo is an Equal Opportunity Employer. Candidates are evaluated without regard to age, race, color, religion, sex, disability, national origin, sexual orientation, veteran status, or any other characteristic protected by federal or state law. Genmo, Inc. is an E-Verify company and you may review the Notice of E-Verify Participation and the Right to Work posters in English and Spanish.

HQ

Genmo San Francisco, California, USA Office

2261 Market Street, San Francisco, CA, United States
