Graphcore

Technical Lead - System Validation Architect

Posted Yesterday
Hybrid
Austin, TX
Senior level
Lead the architecture and execution of Linux-based validation frameworks for Arm-based data center SoCs, defining validation strategy and ensuring system quality.
About us

Graphcore is one of the world’s leading innovators in Artificial Intelligence compute.
It is developing hardware, software and systems infrastructure that will unlock the next generation of AI breakthroughs and power the widespread adoption of AI solutions across every industry.
As part of the SoftBank Group, Graphcore is a member of an elite family of companies responsible for some of the world’s most transformative technologies. Together, they share a bold vision: to enable Artificial Super Intelligence and ensure its benefits are accessible to everyone.
Graphcore’s teams are drawn from diverse backgrounds and bring a broad range of skills and perspectives. A melting pot of AI research specialists, silicon designers, software engineers and systems architects, Graphcore enjoys a culture of continuous learning and constant innovation.

Job Summary

We are seeking a Technical Lead – System Validation Architect to lead the architecture and execution of Linux-based validation frameworks for Arm-based data center SoCs. This role will define validation strategy, test coverage, and methodology across CPU, memory, interconnect, and high-speed I/O subsystems. You will provide technical leadership in validation architecture, automation, benchmarking, and debug to ensure robust system quality and scalability.

The Team

The Systems Validation Architecture team is responsible for defining and enabling scalable validation methodologies for Graphcore’s next-generation AI compute platforms. The team collaborates closely with hardware, firmware, and systems engineering groups to deliver comprehensive validation coverage and high-quality system enablement.

Responsibilities and Duties
  • Define end-to-end validation strategy and coverage model:
    • Functional, stress, performance, and corner-case testing
  • Translate hardware specifications into structured, parameterized test plans
  • Guide the team in:
    • Selecting appropriate tools
    • Defining workload models and parameter configurations
  • Establish standards for:
    • Test case definition (parameters, metrics, pass/fail criteria)
    • Result validation and reporting
  • Apply multi-core and parallel programming expertise, including workload scaling and CPU affinity management
  • Review Python-based automation, orchestration, and analysis code
  • Collaborate with hardware, firmware, and system teams to debug issues

Candidate Profile

Essential:

  • Strong knowledge of Arm SoC architecture and Linux systems.
  • 8+ years of experience in system validation, performance engineering, or low-level systems development.
  • Deep understanding of CPU architecture, cache coherency, memory systems (DDR, HBM, NUMA), and high-speed I/O technologies such as PCIe.
  • Proven ability to define validation strategies, coverage models, and validation methodologies.
  • Hands-on experience using and tuning benchmarking tools such as stress-ng, fio, and iperf.
  • Strong Python programming skills for automation, orchestration, and data analysis.
  • Experience with performance analysis using tools such as perf and PMU counters.
  • Strong analytical and problem-solving skills, and the ability to collaborate in multi-functional environments.

Desirable:

  • Experience working with large-scale or data center systems.
  • Strong programming skills in C/C++ and Python for system-level development.
  • Previous technical leadership or mentoring experience.
  • Experience with scalable validation infrastructure and automation frameworks.
  • Knowledge of AI infrastructure or hyperscale compute systems.
