Fireworks AI

AI Infrastructure Engineer

Sorry, this job was removed at 06:05 p.m. (PST) on Saturday, Apr 26, 2025
In-Office
Redwood City, CA, USA

Job Duties:
  • Design core, backend software components.
  • Interface with other teams to incorporate their innovations and vice versa.
  • Conduct design and code reviews.
  • Analyze and improve efficiency, scalability, and stability of various system resources.
  • Design and implement the hardware and software infrastructure required for AI projects.
  • Procure, configure, and manage servers, GPUs, TPUs, and other hardware resources.
  • Set up cloud-based environments (e.g., AWS, Azure, GCP) for AI workloads.
  • Deploy and manage distributed computing clusters (e.g., Kubernetes) for AI model training and inference.
  • Optimize cluster performance and resource allocation for AI workloads.
  • Monitor cluster health and troubleshoot issues as they arise.
  • Architect and maintain data storage solutions (e.g., data lakes, databases) for AI datasets.
  • Ensure data security, access controls, and data versioning.
  • Implement data pipelines for efficient data ingestion and preprocessing.
  • Develop and maintain automation scripts and tools for infrastructure provisioning and scaling.
  • Implement continuous integration and continuous deployment (CI/CD) pipelines for AI models.
  • Orchestrate workflows for training, evaluation, and deployment of AI models.
  • Optimize infrastructure to handle large-scale AI workloads efficiently.
  • Monitor and analyze system performance, making adjustments as needed.
  • Implement load balancing and scaling strategies to meet demand.
  • Implement security best practices to protect AI infrastructure and data.
  • Stay up-to-date with security vulnerabilities and apply patches and updates.
  • Ensure compliance with relevant data privacy and regulatory requirements.
  • Collaborate with data scientists and AI engineers to understand their infrastructure needs.
  • Provide technical support and troubleshooting assistance for AI infrastructure issues.
  • Train and educate team members on best practices for using AI infrastructure.
Minimum Education & Experience Required:
Must have a Bachelor’s degree or the equivalent in Computer Science, Computer Engineering, or a related field, plus three (3) years of experience with ML infrastructure (PyTorch, Vertex AI, and SageMaker) or related experience.


Minimum Skills Required:
Must have experience with:
  • One or more of the following: search engines, recommendations, natural language processing, personalization, or a similar applied ML domain.
  • Building, scaling, and optimizing distributed enterprise-grade machine learning systems.
  • Architectural patterns of large-scale software applications.
  • Publishing papers in machine learning and/or computer vision conferences and journals.
  • Large-scale machine learning techniques like semi-supervised learning, weakly-supervised learning, and online adaptation of ML models.
  • Publishing in machine learning domains such as computer vision and natural language processing.

How to Apply:
Submit your resume and apply online at http://www.fireworks.ai/careers, searching for the job by title.

HQ

Fireworks AI Redwood City, California, USA Office

Redwood City, CA, United States, 94063
