Zyphra

Research Engineer - AI Performance & Kernel Optimization

Reposted 7 Days Ago
In-Office
San Francisco, CA, USA
Mid level
As a Research Engineer, you will optimize AI performance, focusing on kernel development for large-scale ML workloads, profiling bottlenecks, and collaborating with teams to enhance model training and inference.
Zyphra is an artificial intelligence company based in San Francisco, California.

The Role:

As a Research Engineer - AI Performance & Kernel Optimization, you will improve and optimize the performance of our large-scale language model training and inference stacks. You will work closely with our pretraining and inference teams to identify bottlenecks, design and implement highly optimized kernels, and push the limits of throughput, latency, and hardware utilization across a range of accelerator platforms. This role is suited for someone who enjoys deep systems work, cares about performance at every level of the stack, and is excited to translate low-level optimizations into meaningful gains for frontier-scale AI systems.

You’ll Work Across:
  • Kernel development and optimization for large-scale ML workloads, using any level of the stack from PTX/assembly to CUDA, HIP, Triton, or other GPU DSLs

  • Performance tuning for training and inference stacks across GPUs and other accelerators

  • Profiling and eliminating bottlenecks in memory movement, communication, scheduling, and compute utilization

  • Optimizing distributed training and inference systems for large MoE models, including large-scale model parallelism

  • Portability and optimization across non-NVIDIA hardware, with special interest in AMD hardware such as the MI300X and MI355X

  • Collaboration with research and infrastructure teams to turn systems improvements into real-world model training and inference gains

What We're Looking For / Requirements:
  • Strong engineering aptitude for building reliable, high-performance systems

  • Excellent low-level performance intuition and the ability to reason about hardware-software interactions

  • Eagerness to rapidly learn new systems, tools, and hardware environments

  • Excellent communication and collaboration skills, with the ability to work effectively across research and engineering teams

  • Enjoyment of diving deep into the weeds and hunting down the last 10–20% of performance

Qualifications / Additional Skills:
  • Experience writing highly performant GPU kernels at any level of abstraction: PTX, CUDA, HIP, Triton, or other kernel DSLs

  • Experience optimizing ML workloads for large-scale training, ideally in language model pretraining or inference environments

  • Experience with non-NVIDIA accelerator hardware, such as AMD, AWS Trainium, Google TPU, Qualcomm, ARM, Intel, and custom ASICs

  • Strong understanding of distributed training systems and parallelism schemes, including data parallelism, tensor/model parallelism, pipeline parallelism, sharding, and communication/computation overlap

  • Experience with performance engineering in other demanding parallel computing environments such as HPC, quantitative finance, scientific computing, graphics, compilers, or numerical simulation

  • Strong systems intuition around memory hierarchy, bandwidth constraints, kernel fusion, launch overhead, communication overhead, and hardware utilization

  • Experience using profiling and debugging tools to drive performance improvements

  • Familiarity with the infrastructure underlying large-scale training and inference, including collective communication libraries and runtime performance analysis

  • Background in a highly technical field such as physics, mathematics, theoretical computer science, computer science, or electrical engineering

  • Any HPC experience is a strong plus

Why Work at Zyphra:
  • Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals. Deep research and engineering excellence are equally valued

  • We strongly value new and crazy ideas and are very willing to bet big on them

  • We move as quickly as we can; we aim to keep the bar to impact as low as possible

  • We all enjoy what we do and love discussing AI

Benefits and Perks:
  • Comprehensive medical, dental, vision, and FSA plans

  • Competitive compensation and 401(k) plan

  • Relocation and immigration support on a case-by-case basis

  • In-office snacks and meals provided

  • Unlimited PTO and company holidays

  • In-person team in San Francisco with a collaborative, high-energy environment

HQ

Zyphra Palo Alto, California, USA Office

Palo Alto, California, United States, 94306
