ComfyUI

Senior/Staff AI Cloud Infra Engineer

Posted One Month Ago
Remote
Hiring Remotely in USA
Senior level
The AI Cloud Infra Engineer will ensure reliability of back-end systems, manage infrastructure, and develop solutions for performance and availability at ComfyUI, focusing on cloud technology and automation.
The summary above was generated by AI
The Role

We are looking for an AI Cloud Infra Engineer to join our infrastructure team. This role will be responsible for ensuring the reliability of our back-end systems, working with the engineers who develop them, and planning for our future growth. Our core infrastructure relies heavily on Kubernetes (K8s), Terraform, and GCP, but we care more about your ability to learn, adapt, and ship robust solutions than whether you've used these exact tools before.

You are a good fit if this describes you:

  • You possess a strong understanding of foundational cloud infrastructure (AWS/GCP/Azure) and Linux provisioning/management tools.

  • You know how to design for reliability and scale with minimal operational overhead.

  • You learn new technologies rapidly because you're excited by solving hard infrastructure challenges.

  • You've scaled infrastructure before and understand the tradeoffs that matter.

  • You think most infrastructure moves too slowly and could be way better automated and optimized.

  • You're comfortable diving into unfamiliar systems and making them work reliably.

  • You are a self-starter who executes quickly, takes ownership, and constantly seeks improvement.

What you'll do:

  • Develop and maintain our core Python platform for routing requests, orchestrating AI workloads, managing GPU server capacity, observability, and more.

  • Develop and maintain our infrastructure layer using Terraform and cloud provider APIs to manage our fleet of GPU workers across cloud and potentially bare metal environments.

  • Own and operate the technologies underpinning our platform, potentially including K8s, FluxCD, Nomad, Prometheus, Thanos, DataDog, Loki, distributed networking/storage, etc.

  • Architect and implement solutions that directly impact the performance and availability of services for millions of ComfyUI users.

  • Work closely with our core engineering team to design and build new infrastructure systems.

  • Help create the vision and lay the foundation for where our infrastructure should go in the next 1/2/5 years.

  • Help shape our technical direction and infrastructure best practices as we grow.

Requirements:

  • You have relevant experience as an AI Cloud Infra Engineer for a high-tech startup.

  • Experience participating in incident management processes.

  • Strong foundation and experience in managing cloud infrastructure (AWS, GCP, or Azure). Experience with bare metal is a plus.

  • Solid understanding of container orchestration (Kubernetes preferred) and CI/CD principles and tools.

  • Excellent communication skills.

  • Proven ability to learn fast and ship quality infrastructure code and configurations.

Nice to have:

  • You have excelled at a fast-paced, high-growth tech startup before or are extremely excited about being in one.

  • Experience specifically with GPU management, scheduling, and monitoring in a large-scale environment.

  • Experience with specific observability tools such as DataDog.

What is ComfyUI?

ComfyUI is the world’s leading visual AI platform — an open, modular system where anyone can build, customize, and automate AI workflows with precision and full control.

Unlike most AI tools that hide their inner workings behind a simple prompt box, ComfyUI gives professionals the freedom to design their own pipelines — connecting models, tools, and logic visually like building blocks.

It’s used by artists, filmmakers, video game creators, designers, researchers, VFX houses, and teams at OpenAI, Netflix, Amazon Studios, Ubisoft, EA, and Tencent, among others, all of whom want to go beyond presets and truly shape how AI creates.

ComfyUI empowers those who were not trained with the power of the brush to also be a painter, and those who are, to be a maestro.

  • Built for users who value transparency and control

  • Infinitely extensible: thousands of community-made nodes and integrations

  • Scales from creative experimentation to production automation

  • Open-source, used by millions, and backed by one of the most active AI communities online

  • Evolving to democratize visual AI creation: empowering everyone from hobbyists to studios, storytellers, and enterprises to be more productive and creative than ever before

ComfyUI isn’t just another AI app. It’s aiming to become the operating system for visual generative AI, the foundation on which the next generation of creative tools is being built.

A creative’s showcase of how Comfy is adopted in their work

About Us

We are a small, intense, and well-funded team in San Francisco that pushes ComfyUI and its ecosystem forward. Our team comes from Stability AI and Google, and many of us contributed to the ComfyUI ecosystem long before working here.

Our organization is flat and there is no hierarchy, only categories: dev, arts, prod, ops, etc. (And no, there is no one here with the title of Member of Technical Staff; it’s long and silly for a job title.)

The only thing that matters is the quality of your cultural fit and execution. We work hard and demand a lot of each other. But we have fun: everyone is here to make something meaningful that will end up being our life’s work. If this mission excites you and you view yourself as a top-tier talent, your future latent self is waiting for you at Comfy.

Check out our GitHub and blog for what we’ve been working on. Our investors include Pace Capital, Chemistry, Abstract Venture, and Guillermo Rauch.

Top Skills

AWS
Azure
Datadog
FluxCD
GCP
Kubernetes
Loki
Nomad
Prometheus
Python
Terraform
Thanos
HQ

ComfyUI San Francisco, California, USA Office

San Francisco, CA, United States
