Technical Account Manager (TAM), GPU Cluster

Together AI

Posted 8 Days Ago

In-Office

San Francisco, CA, USA

260K-290K Annually

Senior level

Serve as the named technical owner for a strategic customer, managing end-to-end GPU infrastructure (compute, networking, storage, facilities). Lead incident lifecycle, RCAs, RMA coordination, observability, capacity expansions, and cross-functional escalation while driving technical roadmaps and executive communications.

The summary above was generated by AI

About the role

As a TAM at Together AI, you will serve as the named technical owner for one of our most strategic customer relationships. You will be the primary technical point of contact across all infrastructure domains — compute, networking, storage, and facilities — ensuring flawless delivery and operational health of large-scale GPU deployments. This role sits at the intersection of deep infrastructure expertise and high-stakes customer partnership, making you a critical driver of both customer success and company growth.

Responsibilities

Serve as the named technical point of contact for a dedicated strategic customer, owning the end-to-end technical relationship across compute, networking, storage, and facilities

Drive structured engagement through regular cadences including status reporting, technical steering meetings, and executive business reviews
Translate customer operational feedback into actionable input for Engineering, Product, and Infrastructure roadmaps

Lead issue lifecycle management, escalation, and RCA authorship across all infrastructure domains in partnership with Support, SRE, DC Ops, and Engineering teams
Own end-to-end RMA coordination and hardware lifecycle management, including acceptance testing, spare inventory management, and hardware health reporting for large-scale GPU deployments
Maintain deep technical expertise across the customer's infrastructure stack — GPU compute, high-speed fabric, and large-scale storage systems — advising on configuration, operational best practices, and incident resolution
Own the observability strategy for the customer estate, including alert policy definition, dashboard development, and proactive health management across all infrastructure layers
Coordinate DC operations and facilities events in partnership with internal teams and hosting providers, ensuring SLA compliance and cluster availability
Act as project manager for all capacity expansions, owning the full node deployment lifecycle from freight receipt through production acceptance

Qualifications

5+ years in a customer-facing technical role, with 2+ years in dedicated technical account management or solutions architecture for large-scale AI or HPC infrastructure
Deep expertise in GPU infrastructure — GPU health diagnostics, RMA workflows, and hardware acceptance testing
Hands-on experience with large-scale Ethernet and InfiniBand fabric architecture
Working knowledge of enterprise storage systems, including high-density NVMe, parallel file systems, and metadata infrastructure
Experience with DC operations, facilities coordination, and hosting provider SLA management
Strong ownership mindset for incident management, RCA authorship, and executive-level customer communication
Proficiency in infrastructure monitoring and observability tooling (Prometheus, Grafana, or equivalent)
Proven ability to manage multiple concurrent workstreams with hyperscaler-level rigor and communication standards
Proficiency in Python, Bash, or infrastructure automation tools preferred

About Together AI

Together AI is a research-driven artificial intelligence company. We believe open and transparent AI systems will drive innovation and create the best outcomes for society, and together we are on a mission to significantly lower the cost of modern AI systems by co-designing software, hardware, algorithms, and models. We have contributed to leading open-source research, models, and datasets to advance the frontier of AI, and our team has been behind technological advancements such as FlashAttention, Hyena, FlexGen, and RedPajama. We invite you to join a passionate group of researchers on our journey in building the next generation of AI infrastructure.

Compensation

We offer competitive compensation, startup equity, health insurance, and other benefits, as well as flexibility in terms of remote work. The US base salary range for this full-time position is: $260K-290K OTE + equity + benefits. Our salary ranges are determined by location, level, and role. Individual compensation will be determined by experience, skills, and job-related knowledge.

Location

San Francisco, CA (Hybrid) or New York, NY (Hybrid)

Equal Opportunity

Together AI is an Equal Opportunity Employer and is proud to offer equal employment opportunity to everyone regardless of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability, gender identity, veteran status, and more.

Please see our Privacy Policy at https://www.together.ai/privacy

584 Castro St, #2050, San Francisco, California, United States, 94114
