Engineering Manager, Site Reliability
“The front page of the internet,” Reddit brings over 500 million people together each month through their common interests, inviting them to share, vote, comment, and create across thousands of communities. Come for the cats, stay for the empathy.
Reddit is building a top-tier SRE organization and is looking for engineering managers to help shape and grow it from its existing core.
This is a high-impact role where you will drive technical roadmaps, operations philosophy, architecture review, and execution for one of the largest sites in the world. The ideal candidate understands the value of an engineering- and metrics-centered approach to reliable service support, roots out toil wherever it may live, and knows that “Hope is not a strategy.”
What You’ll Do
- Build, hire, and lead a high-calibre team of Site Reliability Engineers to act as a source of focused expertise and a force multiplier for Reddit’s product engineering.
- Support multiple Reddit product teams with expertise and engineering development to optimize availability, latency, performance, efficiency, change management, monitoring, emergency response, and capacity planning.
- Lead by example, care for the team, and establish credibility with the quality of the team's technical execution.
- Drive a virtuous cycle of improvement with blame-free postmortems.
- Coach and mentor engineers to support the distribution of best practices across Reddit as a whole.
What We Look For
- 3+ years of experience managing a team of software engineers and/or SREs.
- 5+ years of experience developing cloud and internet-scale systems.
- Software development experience in one or more of: Python, Java, Go, C++, Rust, etc.
- Strong preference is given for deep experience with any of:
  - Cloud infrastructure (AWS, GCE)
  - Kubernetes
  - Metrics, monitoring, and alerting systems
  - CI/CD automation
Additionally We'd Like
- Strong track record of managing a team, including hiring, onboarding, and professional development.
- Strong organizational skills, including the ability to prioritize tasks and keep projects on schedule.
- Expertise in problem solving and in analyzing and troubleshooting systems.
- BS degree in Computer Science or a similar technical field of study, or equivalent practical experience.