The Senior SRE will deploy and operate commercial SaaS platforms, utilizing advanced skills in cloud infrastructure, automation, and systems engineering while promoting efficiency and reliability.
Manufacturing advanced electronics requires understanding millions of signals generated across complex assembly processes. Instrumental builds systems that capture and analyze those signals — images, test results, and process data — enabling engineers to discover failures, identify root causes, and deploy production controls that improve yield and product maturity. Leading companies such as NVIDIA, Cisco, and Meta rely on Instrumental to accelerate new product development and scale manufacturing across global factories. Instrumental has become mission-critical for manufacturers building and scaling the next generation of AI infrastructure hardware.
The Instrumental platform collects, intelligently transforms, and contextually presents manufacturing data to technical end users, enabling them to optimize their manufacturing process in real time. Our core technology is proprietary ML algorithms, packaged in an accessible, user-centric interface – we believe we must have both the best technology and the best access to that technology to win.
Requirements:
- 5 or more years of DevOps or SRE experience deploying and operating commercial SaaS platforms on public cloud infrastructure, AWS preferred.
- Expert knowledge of Linux, shell, containerization, Kubernetes, IaC (Terraform preferred), monitoring, logging, and APM tools.
- Proven ability to take initiative and drive impactful projects to completion efficiently and independently.
- Comfort with ambiguity, pace, and frequent pivots inherent in a startup environment, with a track record of creating clarity for teams.
- Experience introducing and integrating AI tools/processes into development and operation workflows.
- Demonstrated skill in setting, iterating on, and measuring KPIs to ensure ongoing performance, reliability, and efficiency.
- Network/application security and compliance experience is a plus.
Who You Are:
- Dead serious about performance, scalability, and reliability (PSR): You care deeply about how systems behave in the real world and sweat the details around latency, uptime, and scale.
- Systems engineering & infrastructure expertise: You’ve spent real time building and running distributed systems and know your way around cloud infrastructure, networks, and operating systems.
- Automation, automation, automation: If something is repetitive or error-prone, your first instinct is to automate it and make it disappear.
- Operating in ambiguity & high-growth environments: You’re comfortable making good calls without perfect information and adapting as the system and company grow fast.
- Dependable, trustworthy: People trust you to own problems, show up when things are broken, and follow through.
This position requires access to items and data that are developed under U.S. government contracts and subject to dissemination controls that limit access to U.S. citizens only.
We’re a growing team that works collaboratively, is supportive of each other, and is highly energized by the opportunity for a large impact. We actively work to promote an inclusive environment, valuing passion and the ability to learn. You’re encouraged to apply even if your experience doesn’t precisely match the job description!
The following is a representative annual base salary range for this position within the Bay Area: $158-225k. We consider candidates at multiple levels for this role. Job level and salary opportunities are evaluated through our interview process – we review the experience, knowledge, skills, and abilities of each applicant.
Instrumental is proud to offer a highly rated range of benefits, including health, vision, dental, commuter plans, and parental leave.
Top Skills
APM Tools
AWS
Containerization
Kubernetes
Linux
Logging Tools
Monitoring Tools
Shell
Terraform
Instrumental Palo Alto, California, USA Office
909 Alma Street, Palo Alto, CA, United States, 94301



