The Home Depot

Staff Software Engineer, Reliability Engineer - Observability (Remote)

Sorry, this job was removed at 07:48 p.m. (PST) on Friday, Jun 13, 2025

Remote

Hiring Remotely in Georgia, USA

120K-190K Annually

With a career at The Home Depot, you can be yourself and also be part of something bigger.

Position Purpose:

The Staff Reliability Engineer – Observability is responsible for leading the design, implementation, and evolution of observability solutions that ensure the reliability, performance, and efficiency of our systems. As a Staff Reliability Engineer, you will be part of a dynamic team with engineers of all experience levels who help each other build and grow technical and leadership skills while creating, deploying, and supporting production applications.
As a Staff Reliability Engineer, you are expected to build and grow the skillsets of the more junior Engineers.

Key Responsibilities:

50% Delivery and Execution - Develops, tests, deploys, and maintains software, with a clear understanding of the value the software is to provide; Takes a broad view when approaching issues, using a global lens; Consistently achieves results, even under tough circumstances; Develops test suites (functional, destructive, etc.) to enable successful, rapid deployment of code to production; Takes on new opportunities and tough challenges with a sense of urgency, high energy, and enthusiasm
10% Learns and Grows - Actively seeks ways to grow and be challenged using both formal and informal development channels; Learns through successful and failed experiments when tackling new problems
20% Plans and Aligns - Creates new and better ways for the organization to be successful; Delivers multi-mode communications that convey a clear understanding of the unique needs of different audiences; Works with the Product Team to ensure user stories are developer ready, easy to understand, and testable; Collaborates with other team members in agile processes; Relates openly and comfortably with diverse groups of people; Adapts approach and demeanor in real time to match the shifting demands of different situations
20% Supports and Enables - Fields questions from product and engineering teams; Helps grow junior engineers by providing guidance on modern software development frameworks, and leading technical discussions; Notes gaps on the team and provides suggestions for changes to make the team more productive

Direct Manager/Direct Reports:

This position typically reports to Software Engineer Manager or Sr. Manager
This position typically has 0 Direct Reports

Travel Requirements:

No travel required.

Physical Requirements:

Most of the time is spent sitting in a comfortable position and there is frequent opportunity to move about. On rare occasions there may be a need to move or lift light articles.

Working Conditions:

Located in a comfortable indoor area. Any unpleasant conditions would be infrequent and not objectionable.

Minimum Qualifications:

Must be eighteen years of age or older.
Must be legally permitted to work in the United States.

Preferred Qualifications:

3-5 years of relevant work experience in site reliability engineering or related field
Experience in monitoring and observability, including designing and implementing observability solutions using OpenTelemetry, Prometheus, and distributed tracing
Proficiency in cloud platforms (GCP preferred) and infrastructure as code (Terraform, Ansible)
Experience in programming languages such as Go, Python, and Java
Experience with creating and executing unit, functional, destructive, and performance tests
Experience with modern debugging and root cause analysis techniques
Experience in designing systems for High Availability, Disaster Recovery, Performance, Efficiency, and Security
Experience in leading observability initiatives, including defining instrumentation standards and building monitoring dashboards
Hands-on experience implementing alerting thresholds and automated responses based on service level objectives (SLOs)
Strong experience with Kubernetes cluster management, optimization, and scaling
Expertise in container orchestration, including best practices for containerized application deployments and resource optimization
Experience designing, building, and maintaining scalable cloud infrastructure on GCP
Proficiency in automating routine operational tasks to reduce toil and improve efficiency
Familiarity with integrating observability-driven alerts with incident management systems and leading incident response efforts
Experience optimizing system performance, identifying and resolving bottlenecks, and conducting capacity planning
Knowledge of database performance tuning, query optimization, and designing application stress testing methodologies
Familiarity with service mesh technologies (Istio, Linkerd)

Minimum Education:

The knowledge, skills and abilities typically acquired through the completion of a bachelor's degree program or equivalent degree in a field of study related to the job.

Preferred Education:

No additional education

Minimum Years of Work Experience:

Preferred Years of Work Experience:

No additional years of experience

Minimum Leadership Experience:

None

Preferred Leadership Experience:

None

Certifications:

None

Competencies:

Global Perspective
Manages Ambiguity
Nimble Learning
Self-Development
Collaborates
Cultivates Innovation
Situational Adaptability
Communicates Effectively
Drives Results
Interpersonal Savvy

For California, Colorado, Connecticut, Rhode Island, Nevada, New York City, Ithaca (NY), Westchester County (NY), and Washington residents:

The pay range for this position is between $120,000 and $190,000.
