NVIDIA

Senior Software Development Engineer in Test

Posted 2 Days Ago
In-Office
Santa Clara, CA, USA
168K-270K Annually
Senior level
Lead design, implementation, and automation of large-scale cloud and data center test infrastructure. Develop CI/CD pipelines, validate performance, scalability, and reliability, debug clusters (network, storage, security), manage Kubernetes and cloud environments, leverage AI tools to accelerate testing, and coordinate cross-team bring-up and issue resolution.
The summary above was generated by AI

We are seeking a highly skilled and hard-working Senior Test Developer / Test Engineer to join our multifaceted Enterprise Software QA team. This role offers an outstanding opportunity to leave your mark on the design, construction, optimization, and testing of large-scale infrastructure for foundational NVIDIA unified cloud services and data center offerings. If you are a dedicated engineer with strong expertise in cloud infrastructure and distributed systems and want to apply your skills with AI tools, this role could fit you perfectly. You will thrive in an exciting, innovative environment.

What you'll be doing:

  • Work with development teams on test plans for all layers of the software stack for cloud infrastructure, covering execution, reviews, failure analysis, and assessment of overall quality and risk. Work with customer PMs on software issues, including technical feedback from OEMs and CSPs. Develop key benchmarks to track execution, and deploy process improvements to increase efficiency.

  • Leverage AI skills to expedite test scoping, test planning, execution, and automation workflows.

  • Lead NVIDIA Cloud and Data Center bring-up activities, which involve validation, reporting, working with engineering to debug issues, occasionally providing design input, and adding coverage in different areas.

  • Design, develop and maintain CI/CD pipelines for continuous testing in cloud environments when needed.

  • Conduct performance, scalability, and reliability testing of cloud services.

  • Implement and maintain test environments in cloud platforms such as AWS, Azure, or Google Cloud.

  • Monitor the infrastructure and alert on significant events, ensuring the highest level of system performance and reliability.

  • Work with partner teams to ensure clusters are available for testing, and take the lead in resolving all issues.

  • Work with teams to ensure the quality of delivered cloud products, focusing on critical areas such as security, storage, workloads, and performance on the latest software and firmware components.

What we need to see:

  • A Master's or Ph.D. in Computer Science or a related field, or equivalent experience.

  • Experience with AI development tools for creating and automating test cases, measuring code coverage, and triaging.

  • 8+ years of hands-on experience in cluster management and related tools, including Docker containers, Slurm, Kubernetes, and Ansible.

  • 2+ years of strong experience with cloud infrastructure platforms such as AWS, Azure, Google Cloud, and OCI.

  • Hands-on experience with network, storage, security, and cluster configuration and debugging, and with cloud infrastructure management tools such as Terraform and Ansible.

  • Expertise in administering, operating, and configuring Kubernetes.

  • Experience with CI/CD tools such as GitLab and Jenkins, and with the GitOps model.

  • Proficiency in various monitoring tools: Prometheus, Grafana, CloudWatch, and Thanos.

  • Proficiency in debugging issues involving networks, DHCP, DNS, HTTP, Linux, and containers.

Ways to Stand Out from the Crowd:

  • Familiarity with "Base Command Manager" for managing and monitoring high-performance computing.

  • Experience writing automation for web applications using tools such as Selenium and Playwright.

Your base salary will be determined based on your location, experience, and the pay of employees in similar positions. The base salary range is 168,000 USD - 270,250 USD.

You will also be eligible for equity and benefits.

Applications for this job will be accepted at least until July 6, 2026.

This posting is for an existing vacancy. 

NVIDIA uses AI tools in its recruiting processes.

NVIDIA is committed to fostering an inclusive work environment and proud to be an equal opportunity employer. As we highly value diversity in our current and future employees, we do not discriminate (including in our hiring and promotion practices) on the basis of race, religion, color, national origin, gender, gender expression, sexual orientation, age, marital status, veteran status, disability status or any other characteristic protected by law.

HQ

NVIDIA Santa Clara, California, USA Office

2701 San Tomas Expressway, Santa Clara, CA, United States

NVIDIA San Francisco, California, USA Office

San Francisco, United States

NVIDIA San Jose, California, USA Office

San Jose, United States
