FriendliAI

United States
34 Total Employees
Year Founded: 2021

Jobs at FriendliAI

Recently posted jobs

2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design and optimize high-performance GPU kernels (GEMM, attention, routing) for AI inference across NVIDIA and AMD GPUs. Implement CUDA/C++ and low-level assembly code, build reduced-precision/quantized (FP8/FP4) kernels, benchmark cross-vendor performance, contribute to internal GPU libraries, accelerate multi-modal pipelines, and integrate next-generation GPU features into production.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Own quality for FriendliAI's full SaaS stack, including backend microservices, frontend, model deployments, and inference. Build pytest automated suites, Locust performance tests, Playwright end-to-end tests, and design strategies for validating LLM inference and model deployment workflows.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, implement, and optimize GPU kernels, kernel compiler, memory planner, and runtime for low-latency generative AI inference. Analyze performance bottlenecks across hardware and software, collaborate with infrastructure teams, and maintain production profiling, benchmarking, and validation tooling while supporting new model architectures and multi-GPU strategies.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, deploy, and operate large-scale LLM and multimodal inference architectures. Work hands-on with customer engineering teams to containerize, scale, monitor, and troubleshoot GPU-based inference workloads across Kubernetes, CI/CD, and hybrid/on-prem environments. Create Helm charts, Terraform modules, and observability tooling while delivering workshops and platform reliability insights.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Own and evolve core backend microservices for an AI inference platform: build production-grade APIs, multi-tenant SaaS features (auth, RBAC, billing), design OLTP/OLAP data models, collaborate on multi-cloud orchestration, ensure reliability/performance, and drive engineering quality through testing and CI/CD.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, build, and maintain agent APIs and production agent applications for document understanding, advanced RAG, and customer support automation. Integrate open-source models, collaborate with backend and infrastructure teams on deployment and monitoring, and ensure APIs are robust, scalable, and developer-friendly.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, build, and maintain the Python SDK and cross-platform CLI, manage packaging and PyPI releases, develop internal DevOps and developer tools, and produce examples, templates, and docs to improve developer experience while collaborating with product and frontend teams.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Lead end-to-end enterprise sales for FriendliAI's AI inference platform: generate pipeline, close high-value deals, run technical POCs, engage AI/ML communities, collaborate with engineering, and inform the product roadmap.
2 Days Ago
Hybrid
San Francisco, CA, USA
Artificial Intelligence • Cloud • Generative AI • Infrastructure as a Service (IaaS)
Design, build, and maintain a scalable web platform and APIs for deploying and monitoring multimodal AI models and agent workflows. Collaborate with product, infrastructure, and design teams to optimize performance, ensure reliability, drive CI/CD and testing, and contribute to long-term architecture decisions for a cloud-native, multi-tenant SaaS system.