D-Wave Systems

Senior Site Reliability Engineer

Reposted 6 Days Ago
Remote
2 Locations
124K-186K Annually
Senior level
As a Senior Site Reliability Engineer, you will ensure the reliability of SaaS products and infrastructure, collaborate with teams for monitoring and alerting, and support incident management and cross-training efforts.

D-Wave (NYSE: QBTS) is a leader in the development and delivery of quantum computing systems, software, and services. We are the world’s first commercial supplier of quantum computers, and the only company building both annealing and gate-model quantum computers. Our mission is to help customers realize the value of quantum, today. Our quantum computers, the world’s largest, feature QPUs with sub-second response times and can be deployed on-premises or accessed through our quantum cloud service, which offers 99.9% availability and uptime. More than 100 organizations trust D-Wave with their toughest computational challenges. With over 200 million problems submitted to our quantum systems to date, our customers apply our technology to address use cases spanning optimization, artificial intelligence, research and more. Learn more about realizing the value of quantum computing today and how we’re shaping the quantum-driven industrial and societal advancements of tomorrow: www.dwavequantum.com.

 

You can read more about our company and our innovations in the pages of The Wall Street Journal, Time Magazine, Fast Company, MIT Technology Review, Forbes, Inc. Magazine, Wired and across many whitepapers. 

  

At D-Wave, we’re helping customers realize the value of quantum computing today and are shaping the quantum-driven industrial and societal advancements of tomorrow.


About the role

We are seeking a talented and experienced Senior Site Reliability Engineer (SRE) to join our DevOps team. As a key member of the team, you will be responsible for the reliability of our SaaS product, our research laboratory, and the infrastructure supporting our production quantum computers worldwide. You will play a critical role in ensuring the reliability, scalability, and performance of our company’s systems and infrastructure. The ideal candidate will have a strong background in systems administration, automation and troubleshooting complex distributed systems.


What you'll do

  • Refine, refactor, and evolve monitoring systems and related tools covering our workloads in AWS, GCP, on-premises, and remote field systems across the world
  • Work with teams including software and hardware engineering, processor development, cryogenics, and customer support to elicit requirements, collect and store metrics, analyze trends, and provide dashboards and other tooling to enable observability across the organization
  • Co-own alerting with other SREs: support the alerting infrastructure and on-call management systems, and ensure alerting is reliable and scalable
  • Work closely with the DevOps and Test Engineering teams to instrument builds and deploys, ensuring reliability through every step of the software development lifecycle

About You

  • 4+ years of experience operating and troubleshooting SaaS/PaaS applications and environments on a major cloud platform – AWS and GCP preferred – including platform-specific monitoring technologies like CloudWatch and Stackdriver
  • 4+ years of experience with high-level SRE work including incident management, process design, managing on-call rotations (with PagerDuty), and cross-training new and existing employees
  • Experience with on-premises compute, including servers, storage, power, virtualization, and networking equipment, and specifically with using SNMP to monitor networked devices
  • 4+ years of experience with AOS/Elasticsearch/Loki or similar log management tools
  • Experience with time series databases like Prometheus/InfluxDB, document stores like MongoDB, and classic relational databases like PostgreSQL, AWS Redshift, etc.
  • Proficiency in InfluxQL and PromQL
  • Significant expertise supporting and integrating analytics and monitoring systems such as ELK, Grafana, Prometheus, Zabbix, LibreNMS, Intermapper, etc.
  • At least two years of programming experience in Python, Go, Bash, Ruby, or equivalent
  • Degree in Computing Science, Engineering or equivalent education and experience
  • Excellent oral and written communication skills – you like to document your work!

Bonus Points

  • 3+ years of specific experience with Elasticsearch / AWS OpenSearch, Fluent, Grafana Cloud
  • Experience with Kubernetes monitoring
  • Experience with producing synthetic metrics and instrumenting existing applications and platforms to extract metrics for analysis
  • Experience with OpenTelemetry
  • Proven record of cross-training and evangelizing observability as a critical aspect of all systems

A D-Waver's DNA

  • We look at the future and say “why not”; we see possibilities where others see problems or routines. We show the way ahead and are committed to achieving ambitious goals.
  • We practice straight talk and listen generously to each other with empathy. We value different opinions and points of view. We ensure that we connect outside as well as inside to learn from others and inspire each other.
  • We hold ourselves accountable for delivering results. We make decisions & take responsibility so that we can act & support each other.
  • As leaders, we motivate & engage our teams to achieve more than they originally thought possible, by developing our teams & creating the conditions for people to grow and empower themselves through enabling & coaching.

Our Compensation Philosophy is Simple but Powerful:

We believe providing D-Wavers with company ownership, competitive pay, and a range of meaningful benefits is the start of creating a culture where people want to give the best they’ve got — not because they’re simply making money, but because they’ve fallen in love with our vision, mission, values, and team. 


During the interview process, your Recruiter will review our total rewards (base, equity, bonus, perks, benefits, culture) offerings. The final offer is determined by your proficiencies within this level.


Inclusion: 

We celebrate diverse perspectives to drive innovation in our pursuit. Our employees range from distinguished domain experts with decades of experience in their respective fields, to bright and motivated graduates eager to make their mark. Our diverse and innovative team will make you feel appreciated and supported, and will empower your career growth at D-Wave.


The Fine Print: 

No 3rd party candidates will be accepted


It is D-Wave Systems Inc. policy to provide equal employment opportunity (EEO) to all persons regardless of race, color, religion, sex, national origin, age, sexual orientation, gender identity, genetic information, physical or mental disability, protected veteran status, or any other characteristic protected by federal, state/provincial, or local law.


The base pay range for this role is:

124,364 - 185,545 USD (Remote, United States)

124,364 - 185,545 CAD (Remote, Canada)

Top Skills

AWS
AWS Redshift
Bash
Elasticsearch
GCP
Go
Grafana
InfluxDB
LibreNMS
MongoDB
OpenTelemetry
Postgres
Prometheus
Python
Ruby
SNMP
Zabbix

D-Wave Systems Palo Alto, California, USA Office

2650 E Bayshore Rd, Palo Alto, CA, United States, 94303
