Senior Machine Learning Engineer

TetraMem

Reposted 5 Days Ago

In-Office

San Jose, CA, USA

200K-280K Annually

Senior level

Develop and optimize lightweight machine learning models for edge AI applications, integrating into production systems and mentoring junior engineers.

The summary above was generated by AI

Responsibilities:

Develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing.
Implement and optimize ML models on embedded platforms, including FPGA and custom ASIC solutions.
Work closely with hardware and software teams to integrate ML models into production systems.
Research and implement state-of-the-art ML techniques to improve model efficiency and reduce latency and power consumption in embedded AI applications.
Improve inference efficiency and model compression techniques, including quantization, pruning, and knowledge distillation.
Collaborate with cross-functional teams to drive innovation and contribute to the overall system architecture.
Provide technical leadership and mentorship to junior engineers.
Publish research findings, present at conferences, and contribute to open-source projects when applicable.

Requirements:

5+ years of relevant industry experience (or a PhD) in Computer Science, Electrical Engineering, Machine Learning, or related fields.
Must have prior experience managing a team, serving in a Team Lead role, or demonstrating strong technical leadership and cross-functional coordination capabilities.
Strong hands-on experience in machine learning, with a focus on edge AI, on-device inference, and deploying lightweight models on resource-constrained devices.
Expertise in modern ML frameworks such as PyTorch, TensorFlow (including TensorFlow Lite), and JAX.
Proficiency in Python and C/C++, with practical experience in ML model optimization and production deployment.
Deep experience with model quantization (PTQ/QAT), pruning, knowledge distillation, sparsity, and other compression techniques for efficient edge inference.
Hands-on experience developing for or integrating with AI chip SDKs, neural accelerators (NPUs/DSPs), or hardware-specific toolchains (e.g., NVIDIA TensorRT, Qualcomm Neural Processing SDK, ARM Ethos, or similar).
Familiarity with edge inference runtimes (ONNX Runtime, ExecuTorch, TVM) and optimizing models for hardware constraints (latency, memory footprint, power consumption).

Experience in one or more of the following areas is considered a strong plus:

Understanding of ML compiler and runtime design.
Experience working with tools such as Optimum, ONNX, TensorRT, TFLite/LiteRT, ncnn, or CoreML.
Familiarity with hardware acceleration techniques.
Experience in embedded system development.

Salary Range: $200,000 - $280,000 / year

TetraMem celebrates diversity and is committed to creating an inclusive environment for all employees. We are proud to be an Equal Opportunity Employer and welcome applicants from all backgrounds. Qualified candidates will receive consideration for employment without regard to race, color, religion, creed, sex, gender identity or expression, sexual orientation, national origin, ancestry, age, marital status, medical condition, disability, genetic information, military or veteran status, or any other characteristic protected by applicable federal, state, or local law.

TetraMem is committed to providing reasonable accommodations to qualified applicants with disabilities throughout the recruitment process. Applicants requiring accommodation may contact Human Resources for assistance.

To ensure a fair, consistent, and efficient hiring process, all candidates must apply through TetraMem’s official ClearCompany Applicant Tracking System (ATS). Applications submitted through the ATS allow our hiring team to evaluate candidates using a standardized process and ensure timely communication throughout the recruitment process. To promote equal consideration for all applicants, applications submitted outside of the ClearCompany ATS, including direct emails, LinkedIn messages, or unsolicited submissions to employees, may not be reviewed or considered.

We encourage all interested candidates to apply through the official TetraMem Careers page.

Newark, CA, United States
