TensorWave

Site Reliability Engineer

Reposted Yesterday
Remote
Hiring Remotely in USA
Senior level
The Senior Site Reliability Engineer will design, build, and maintain resilient infrastructure systems, manage infrastructure-as-code, and write tooling in various languages.
The summary above was generated by AI

At TensorWave, we’re leading the charge in AI compute, building a versatile cloud platform that’s driving the next generation of AI innovation. We’re focused on creating a foundation that empowers cutting-edge advancements in intelligent computing, pushing the boundaries of what’s possible in the AI landscape.

About the Role:

We're looking for a Senior Site Reliability Engineer with a strong software engineering background to build and maintain highly scalable, secure, and resilient infrastructure. You’ll play a critical role in designing low-level systems, automating infrastructure with modern tooling, and ensuring platform reliability. This role is ideal for someone who’s comfortable working at the intersection of systems programming and DevOps: writing code in Go, JavaScript, Rust, C, or Zig while also managing infrastructure with NixOS, Kubernetes, and Terraform.

Responsibilities:
  • Design, build, and maintain infrastructure systems using Linux and NixOS.

  • Manage infrastructure-as-code with Terraform to provision and scale resources.

  • Architect and operate Kubernetes clusters with a focus on performance, security, and automation.

  • Write high-performance tooling and internal utilities in Go, JavaScript, and Rust.

  • Develop and maintain CI/CD pipelines for infrastructure and code deployments.

  • Monitor system performance, resolve issues, and improve reliability through observability tooling.

  • Collaborate closely with engineering teams to support deployment strategies and development workflows.

Essential Skills & Qualifications:
  • 5+ years in DevOps, Site Reliability, or Infrastructure Engineering roles.

  • Deep experience with Linux systems and configuration management (preferably NixOS).

  • Hands-on experience with Terraform, Kubernetes, and containerized environments.

  • Proficiency in one or more of the following languages: Rust, C, Zig, JavaScript, or Go.

  • Strong understanding of systems programming, performance tuning, and operating system internals.

  • Familiarity with CI/CD practices and infrastructure monitoring/alerting tools.

We’re looking for resilient, adaptable people to join our team—folks who enjoy collaborating and tackling tough challenges. We’re all about offering real opportunities for growth, letting you dive into complex problems and make a meaningful impact through creative solutions. If you're a driven contributor, we encourage you to explore opportunities to make an impact at TensorWave. Join us as we redefine the possibilities of intelligent computing.

What We Bring:
  • Stock Options

  • 100% paid Medical, Dental, and Vision insurance

  • Life and Voluntary Supplemental Insurance

  • Short Term Disability Insurance

  • Flexible Spending Account

  • 401(k)

  • Flexible PTO

  • Paid Holidays

  • Parental Leave

  • Mental Health Benefits through Spring Health

Top Skills

C
Go
JavaScript
Kubernetes
NixOS
Rust
Terraform
Zig
