About this role:
Wells Fargo is seeking a Senior Software Engineer - LLM Inferencing & AI Gateway to join our Digital Technology - AI Capability Engineering team. In this role, you will design, build, and operate the GPU-based GenAI platform and the serving infrastructure for LLM/SLM workloads. Your work will span the full stack, from GPU cluster configuration and Run:AI/OpenShift AI orchestration to optimizing vLLM/Triton runtimes and hardening these systems for production use.
Key focus areas include H100/H200 GPU clusters, NVLink/NVSwitch, MIG, CUDA/NVML, GPU scheduling, and disaggregated inferencing patterns (prefill/decode). You will also drive observability best practices and deliver reliable, scalable model endpoints through an API Gateway-based production architecture.
In this role, you will:
- Lead complex Generative AI initiatives and deliverables within technical domain environments
- Contribute to large-scale planning of strategies
- Design, code, test, debug, and document for projects and programs associated with technology domain, including upgrades and deployments
- Review moderately complex technical challenges that require an in-depth evaluation of technologies and procedures
- Resolve moderately complex issues and lead a team to meet the needs of existing or potential new clients while leveraging a solid understanding of the function, policies, procedures, or compliance requirements
- Collaborate and consult with peers, colleagues, and mid-level managers to resolve technical challenges and achieve goals
- Lead projects and act as an escalation point, provide guidance and direction to less experienced staff
- Engineer GPU clusters and node pools; configure NVLink/NVSwitch, NVIDIA GPU Operator, MIG profiles, container runtime, and kernel/driver baselines for high-throughput LLM/SLM workloads
Qualifications:
- 4+ years of Software Engineering experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
- 1+ years of experience with GPU Inference including NVIDIA CUDA, cuDNN, NVLink/NVSwitch, MIG, NIXL, GPU profiling, and performance tuning on H100/H200 architectures
- 1+ years of experience with GPU orchestration platforms, such as RunAI (collections, queues, quotas, preemption, fair-share scheduling), OpenShift AI (RHOAI), and cluster administration on OCP or GKE
- 1+ years of experience with LLM/SLM serving frameworks, including vLLM, Triton, TensorRT-LLM/MII, KV-cache optimization strategies, and FP8/INT4 quantization techniques (AWQ/GPTQ)
- 1+ years of experience working with LLM API gateways, including OAuth2/mTLS authentication, rate-limiting and quota management, OpenAPI/SDK integration, SLAs, and versioning/deprecation practices
- 2+ years of experience in Generative AI engineering, including LLM/SLM operations, fine-tuning, evaluation pipelines, and developing model-specific performance optimization recipes
- 4+ years of experience in Python, including scripting, automation, and model/inference-related development
- Hybrid onsite at required locations
- No visa sponsorship available
- No relocation assistance for this position
The base pay range offered for this position is shown below. Pay may vary depending on factors including but not limited to demonstrated examples of prior performance, skills, experience, or work location. Employees may also be eligible for incentive opportunities.
$100,000.00 - $196,000.00
Benefits
Wells Fargo provides eligible employees with a comprehensive set of benefits, many of which are listed below. Visit Benefits - Wells Fargo Jobs for an overview of the following benefit plans and programs offered to employees.
- Health benefits
- 401(k) Plan
- Paid time off
- Disability benefits
- Life insurance, critical illness insurance, and accident insurance
- Parental leave
- Critical caregiving leave
- Discounts and savings
- Commuter benefits
- Tuition reimbursement
- Scholarships for dependent children
- Adoption reimbursement
Posting End Date: 26 Jan 2026
* Job posting may come down early due to volume of applicants.
We Value Equal Opportunity
Wells Fargo is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, national origin, disability, status as a protected veteran, or any other legally protected characteristic.
Employees support our focus on building strong customer relationships balanced with a strong risk mitigating and compliance-driven culture which firmly establishes those disciplines as critical to the success of our customers and company. They are accountable for execution of all applicable risk programs (Credit, Market, Financial Crimes, Operational, Regulatory Compliance), which includes effectively following and adhering to applicable Wells Fargo policies and procedures, appropriately fulfilling risk and compliance obligations, timely and effective escalation and remediation of issues, and making sound risk decisions. There is emphasis on proactive monitoring, governance, risk identification and escalation, as well as making sound risk decisions commensurate with the business unit's risk appetite and all risk and compliance program requirements.
Applicants with Disabilities
To request a medical accommodation during the application or interview process, visit Disability Inclusion at Wells Fargo.
Drug and Alcohol Policy
Wells Fargo maintains a drug-free workplace. Please see our Drug and Alcohol Policy to learn more.
Wells Fargo Recruitment and Hiring Requirements:
a. Third-Party recordings are prohibited unless authorized by Wells Fargo.
b. Wells Fargo requires you to directly represent your own experiences during the recruiting and hiring process.
Top Skills
cuDNN
MIG
NIXL
NVIDIA CUDA
NVLink
NVSwitch
OpenShift AI
Python
RunAI
TensorRT-LLM
Triton
vLLM
Wells Fargo San Francisco, California, USA Office
420 Montgomery St, San Francisco, CA, United States, 94103

