Zyphra Jobs

Research Engineer - AI Performance & Kernel Optimization

Zyphra

Reposted 7 Days Ago

In-Office

San Francisco, CA, USA

Mid level

As a Research Engineer, you will optimize AI performance, focusing on kernel development for large-scale ML workloads, profiling bottlenecks, and collaborating with teams to enhance model training and inference.

The summary above was generated by AI

Zyphra is an artificial intelligence company based in San Francisco, California.

The Role:

As a Research Engineer - AI Performance & Kernel Optimization, you will improve and optimize the performance of our large-scale language model training and inference stacks. You will work closely with our pretraining and inference teams to identify bottlenecks, design and implement highly optimized kernels, and push the limits of throughput, latency, and hardware utilization across a range of accelerator platforms. This role is suited for someone who enjoys deep systems work, cares about performance at every level of the stack, and is excited to translate low-level optimizations into meaningful gains for frontier-scale AI systems.

You’ll Work Across:

Kernel development and optimization for large-scale ML workloads, using any level of the stack from PTX/assembly to CUDA, HIP, Triton, or other GPU DSLs
Performance tuning for training and inference stacks across GPUs and other accelerators
Profiling and eliminating bottlenecks in memory movement, communication, scheduling, and compute utilization
Optimizing distributed training and inference systems for large MoE models, including large-scale model parallelism
Portability and optimization across non-NVIDIA hardware, with special interest in AMD hardware such as the MI300X and MI355X
Collaboration with research and infrastructure teams to turn systems improvements into real-world model training and inference gains

What We're Looking For / Requirements:

Strong engineering aptitude for building reliable, high-performance systems
Excellent low-level performance intuition and the ability to reason about hardware-software interactions
Excitement about rapidly learning new systems, tools, and hardware environments
Excellent communication and collaboration skills, with the ability to work effectively across research and engineering teams
A drive to dive deep into the weeds and hunt down the last 10–20% of performance

Qualifications / Additional Skills:

Experience writing highly performant GPU kernels at any level of abstraction: PTX, CUDA, HIP, Triton, or other kernel DSLs
Experience optimizing ML workloads for large-scale training, ideally in language model pretraining or inference environments
Experience with non-NVIDIA accelerator hardware, such as AMD, AWS Trainium, Google TPU, Qualcomm, ARM, Intel, and custom ASICs
Strong understanding of distributed training systems and parallelism schemes, including data parallelism, tensor/model parallelism, pipeline parallelism, sharding, and communication/computation overlap
Experience with performance engineering in other demanding parallel computing environments such as HPC, quantitative finance, scientific computing, graphics, compilers, or numerical simulation
Strong systems intuition around memory hierarchy, bandwidth constraints, kernel fusion, launch overhead, communication overhead, and hardware utilization
Experience using profiling and debugging tools to drive performance improvements
Familiarity with infrastructure underlying large-scale training and inference, including collective communication libraries and runtime performance analysis
Background in a highly technical field such as physics, mathematics, theoretical computer science, computer science, or electrical engineering
Any HPC experience is a strong plus

Why Work at Zyphra:

Our research methodology is grounded in methodical, step-by-step approaches to ambitious goals. Deep research and engineering excellence are valued equally
We strongly value new and crazy ideas and are very willing to bet big on them
We move as quickly as we can; we aim to keep the bar to impact as low as possible
We all enjoy what we do and love discussing AI

Benefits and Perks:

Comprehensive medical, dental, vision, and FSA plans
Competitive compensation and 401(k) plan
Relocation and immigration support on a case-by-case basis
In-office snacks and meals provided
Unlimited PTO and company holidays
In-person team in San Francisco with a collaborative, high-energy environment

Palo Alto, California, United States, 94306

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology thanks to a thriving art and music scene, excellent food and a short drive to several of the country’s most beautiful recreational areas.

Key Facts About San Francisco Tech

Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
Major Tech Employers: Google, Apple, Salesforce, Meta
Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
Funding Landscape: $50.5 billion in venture capital funding in 2024 (Pitchbook)
Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
