inference.net

Machine Learning Researcher

Reposted 11 Days Ago
In-Office
San Francisco, CA, USA
250K-350K Annually
Mid level
As a Machine Learning Researcher, you will research and experiment with new architectures, optimize models, and validate findings through rigorous experiments, enhancing AI models for clients.
The summary above was generated by AI

Help us push the boundaries of what's possible in LLM post-training. If you love training models, exploring new architectures, running experiments, and turning research insights into products that ship, we'd love to meet you.

About Inference.net

Inference.net trains and hosts specialized language models for companies that want frontier-quality AI at a fraction of the cost. The models we train match GPT-5 accuracy but are smaller, faster, and up to 90% cheaper. Our platform handles everything end-to-end: distillation, training, evaluation, and planet-scale hosting.

We are a well-funded ten-person team of engineers who work in person in downtown San Francisco on difficult, high-impact engineering problems. Everyone on the team has been writing code for over 10 years and has founded and run their own software company. We are high-agency, adaptable, and collaborative. We value creativity alongside technical prowess and humility. We work hard and deeply enjoy the work that we do. Most of us are in the SF office four days a week; hybrid works for Bay Area candidates.

About the Role

You will be responsible for conducting research into experimental models, training systems, and modalities to create novel products for our customers. Your work will span from exploring new architectures and learning methods to optimizing latency and efficiency, with the goal of delivering better models to customers.

Your north star is pushing the frontier of what's possible in LLM post-training. You'll explore new techniques, run rigorous experiments, and, when something works, bring it into production with the help of your teammates. This includes training models for customers and running evaluations as part of validating your research. This role reports directly to the founding team. You'll have the autonomy, a large compute budget and GPU reservation, and the technical support to explore ambitious ideas and ship the ones that work.

Key Responsibilities

  • Research and experiment with new model architectures to improve quality, efficiency, or capability

  • Explore methods to decrease inference latency and improve serving efficiency

  • Run experiments with new learning methods, including novel approaches to SFT, RLHF, DPO, and other post-training techniques

  • Perform reinforcement learning research to improve model alignment and capability

  • Develop and improve our distillation pipeline for training high-quality models from frontier teachers

  • Train models for clients and run evaluations to validate research findings in production settings

  • Create robust benchmarks and evaluation frameworks that ensure custom models match or exceed frontier performance

  • Stay current with ML research and identify techniques that can improve our platform

  • Collaborate with applied engineers to bring successful research into production systems

  • Document findings and share knowledge with the team

Requirements

  • 3+ years of experience training AI models using PyTorch

  • Deep understanding of transformer architectures, attention mechanisms, and model internals

  • Hands-on experience with post-training LLMs using SFT, RLHF, DPO, or other alignment techniques

  • Experience with LLM-specific training frameworks (e.g., Hugging Face Transformers, DeepSpeed, Megatron, TRL, or similar)

  • Strong experimental methodology, including ability to design, run, and analyze rigorous experiments

  • Track record of implementing ideas from recent ML papers

  • Experience training on NVIDIA GPUs at scale

  • Strong foundation in ML fundamentals: optimization, loss functions, regularization, generalization

Nice-to-Have

  • Publications in ML venues

  • Experience with model distillation or knowledge transfer

  • Experience with LLM speed optimization techniques

  • Familiarity with vision encoders, multimodal models, or other modalities

  • Experience with distributed training and infrastructure at scale

  • Contributions to open-source ML projects

You don't need to tick every box. Curiosity and the ability to learn quickly matter more.

Compensation

We offer competitive compensation, equity in a high-growth startup, and comprehensive benefits. The base salary range for this role is $250,000 - $350,000, depending on experience, plus equity and benefits.

Equal Opportunity

Inference.net is an equal opportunity employer. We welcome applicants from all backgrounds and don't discriminate based on race, color, religion, gender, sexual orientation, national origin, genetics, disability, age, or veteran status.

If you're excited about pushing the boundaries of custom AI research, we'd love to hear from you. Please send your resume and GitHub profile to [email protected], or apply here on Ashby.

HQ

inference.net San Francisco, California, USA Office

San Francisco, California, United States, 94111

What you need to know about the San Francisco Tech Scene

San Francisco and the surrounding Bay Area attract more startup funding than any other region in the world. Home to Stanford University and UC Berkeley, leading VC firms and several of the world’s most valuable companies, the Bay Area is the place to go for anyone looking to make it big in the tech industry. That said, San Francisco has a lot to offer beyond technology, thanks to a thriving art and music scene, excellent food and several of the country’s most beautiful recreational areas a short drive away.

Key Facts About San Francisco Tech

  • Number of Tech Workers: 365,500; 13.9% of overall workforce (2024 CompTIA survey)
  • Major Tech Employers: Google, Apple, Salesforce, Meta
  • Key Industries: Artificial intelligence, cloud computing, fintech, consumer technology, software
  • Funding Landscape: $50.5 billion in venture capital funding in 2024 (Pitchbook)
  • Notable Investors: Sequoia Capital, Andreessen Horowitz, Bessemer Venture Partners, Greylock Partners, Khosla Ventures, Kleiner Perkins
  • Research Centers and Universities: Stanford University; University of California, Berkeley; University of San Francisco; Santa Clara University; Ames Research Center; Center for AI Safety; California Institute for Regenerative Medicine
