Positron AI

Sr Software Engineer

Reposted 5 Days Ago

In-Office or Remote

2 Locations

Senior level

The Senior Software Engineer will develop high-performance software for executing open-source LLMs on custom hardware, focusing on optimizations and efficient libraries primarily in C++.

The summary above was generated by AI

About Us:

Positron.ai specializes in developing custom hardware systems to accelerate AI inference. These inference systems offer significant performance and efficiency gains over traditional GPU-based systems, delivering advantages in both performance per dollar and performance per watt. Positron exists to create the world's best AI inference systems.

Senior Software Engineer – Machine Learning Systems & High-Performance LLM Inference

We are seeking a Senior Software Engineer to contribute to the development of high-performance software that powers the execution of open-source large language models (LLMs) on our custom appliance. This appliance combines FPGAs and x86 CPUs to accelerate transformer-based models. The software stack is written primarily in modern C++ (C++17/20) and relies heavily on templates, SIMD optimizations, and efficient parallel computing techniques.

Key Areas of Focus & Responsibilities

Design and implement high-performance inference software for LLMs on custom hardware.
Develop and optimize C++-based libraries that efficiently utilize SIMD instructions, threading, and memory hierarchy.
Work closely with FPGA and systems engineers to ensure efficient data movement and computational offloading between x86 CPUs and FPGAs.
Optimize model execution via low-level optimizations, including vectorization, cache efficiency, and hardware-aware scheduling.
Contribute to performance profiling tools and methodologies to analyze execution bottlenecks at the instruction and data flow levels.
Apply NUMA-aware memory management techniques to optimize memory access patterns for large-scale inference workloads.
Implement ML system-level optimizations such as token streaming, KV cache optimizations, and efficient batching for transformer execution.
Collaborate with ML researchers and software engineers to integrate model quantization techniques, sparsity optimizations, and mixed-precision execution.
Ensure all code contributions include unit, performance, acceptance, and regression tests as part of a continuous integration-based development process.

Required Skills & Experience

7+ years of professional experience in C++ software development, with a focus on performance-critical applications.
Strong understanding of C++ templates and modern memory management.
Hands-on experience with SIMD programming (AVX-512, SSE, or equivalent) and intrinsics-based vectorization.
Experience in high-performance computing (HPC), numerical computing, or ML inference optimization.
Experience with ML model execution optimizations, including efficient tensor computations and memory access patterns.
Knowledge of multi-threading, NUMA architectures, and low-level CPU optimization.
Proficiency with systems-level software development, profiling tools (Perfetto, VTune, Valgrind), and benchmarking.
Experience working with hardware accelerators (FPGAs, GPUs, or custom ASICs) and designing efficient software-hardware interfaces.

Preferred Skills (Nice to Have)

Familiarity with LLVM/Clang or GCC compiler optimizations.
Experience in LLM quantization, sparsity optimizations, and mixed-precision computation.
Knowledge of distributed inference techniques and networking optimizations.
Understanding of graph partitioning and execution scheduling for large-scale ML models.

Why Join Us?

Work on a cutting-edge ML inference platform that redefines performance and efficiency for LLMs.
Tackle challenging low-level performance engineering problems in AI and HPC.
Collaborate with a team of hardware, software, and ML experts building an industry-first product.
Opportunity to contribute to and shape the future of open-source AI inference software.

Top Skills

AVX-512

C++

Clang

FPGA

GCC

HPC

LLVM

SIMD

SSE
