Netskope

Staff Site Reliability Engineer

Reposted 11 Days Ago
Remote
Hiring Remotely in United States
Expert/Leader
The Staff Site Reliability Engineer will enhance AI/ML infrastructure, manage CI/CD pipelines, ensure system reliability, and troubleshoot applications, focusing on cloud-based operations.
The summary above was generated by AI
About Netskope

Today, there are more data and users outside the enterprise than inside, causing the network perimeter as we know it to dissolve. We realized a new perimeter was needed, one that is built in the cloud and follows and protects data wherever it goes, so we started Netskope to redefine Cloud, Network and Data Security.

Since 2012, we have built the market-leading cloud security company and an award-winning culture powered by hundreds of employees spread across offices in Santa Clara, St. Louis, Bangalore, London, Paris, Melbourne, Taipei, and Tokyo. Our core values are openness, honesty, and transparency, and we purposely developed our open desk layouts and large meeting spaces to support and promote partnerships, collaboration, and teamwork. From catered lunches and office celebrations to employee recognition events and social professional groups such as the Awesome Women of Netskope (AWON), we strive to keep work fun, supportive and interactive. Visit us at Netskope Careers. Please follow us on LinkedIn and Twitter @Netskope.

About the role

We are a team of software engineers focused on improving reliability, availability, latency, performance, efficiency, monitoring, emergency response, and capacity planning of the engineering stacks. If you are passionate about solving complex problems and developing cloud services at scale, we would like to speak with you.

As an SRE, you will be writing software to solve operational problems and drive cutting-edge reliability and observability practices. Your expertise will also extend to setting up and maintaining monitoring, logging, and alerting systems to oversee extensive training runs and client-facing APIs. You will ensure that training environments are optimally available and efficiently managed across multiple clusters, enhancing our containerization and orchestration systems with advanced tools like Docker and Kubernetes.

  • Partner closely with service owners and engineers to develop reliable services driven by best practices
  • Develop software and tools to solve a variety of problems across service and infrastructure
  • Set up and manage monitoring, logging, and alerting systems for extensive training runs and client-facing APIs
  • Ensure training environments are consistently available and prepared across multiple clusters
  • Develop and manage containerization and orchestration systems utilizing tools such as Docker and Kubernetes
  • Improve reliability, quality, and time-to-market of our suite of software solutions
  • Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating for continual improvement
  • Provide primary operational support and engineering for multiple large-scale distributed software applications

Is this you?

  • You work with a sense of ownership
  • You take pride in building and operating scalable, reliable, secure systems
  • You are comfortable with ambiguity and change
  • You have a knack for troubleshooting complex systems and enjoy solving challenging problems
  • You are proactive in identifying problems, performance bottlenecks, and areas for improvement
  • You have experience working and collaborating with teams based across different geographies and time zones

Preferred skills and experience:

  • Software programming experience in any programming language
  • Good understanding of principles of distributed systems
  • Deep understanding of Kubernetes and Docker
  • Understanding of data technologies like Kafka, Yugabyte, Redis, etc.
  • Good understanding of AWS ecosystem
  • Basic understanding of networking
  • Exposure to infrastructure-as-code tools like Terraform
  • Familiarity with monitoring tools such as Prometheus, Grafana, or similar
  • 8+ years building core infrastructure

Nice-to-have experience:

  • Experience in operating and monitoring services communicating across AWS and private clouds
  • Experience operating Kubernetes at scale

Netskope is committed to implementing equal employment opportunities for all employees and applicants for employment. Netskope does not discriminate in employment opportunities or practices based on religion, race, color, sex, marital or veteran status, age, national origin, ancestry, physical or mental disability, medical condition, sexual orientation, gender identity/expression, genetic information, pregnancy (including childbirth, lactation and related medical conditions), or any other characteristic protected by the laws or regulations of any jurisdiction in which we operate.

Netskope respects your privacy and is committed to protecting the personal information you share with us. Please refer to Netskope's Privacy Policy for more details.

Top Skills

AWS
Azure
Bash
Docker
Git
GCP
Grafana
Hugging Face Transformers
Kubernetes
LLM
Prometheus
Python
PyTorch
TensorRT
Terraform

Netskope Santa Clara, California, USA Office

2445 Augustine Dr, Santa Clara, CA, United States, 95054
