Site Reliability Engineer, Infrastructure
As SREs at Meraki, we are responsible for building the reliable and scalable cloud infrastructure that supports millions of Meraki devices across the world. Meraki’s customer base has grown by a factor of 2-3 every year, and we now serve more than 8 billion HTTP requests per day across eight data centers. Our customers depend on our products to run their critical infrastructure of network switches, security appliances, wireless APs and security cameras. We’re passionate about using automation to raise the bar!
In this role you will join the Infrastructure SRE team, which is based out of our offices in Sydney, San Francisco and London. The Infrastructure Site Reliability Engineering team's mission is to make Infrastructure as a Service (IaaS) a reality at Cisco Meraki and to ensure it meets the needs of the various platforms supported within Software Engineering. This typically includes areas such as operating systems, compute (virtualized or traditional bare metal servers), storage, security, networking, network & infrastructure support services, and the vendor management of our physical sites.
You will be responsible for the design, development and operational aspects of the global infrastructure which supports our private cloud. We believe in automating manual tasks with the right tools. This involves designing, building and running automated systems written in Ruby. You will work closely with our existing vendors to coordinate all hands-on work. We embrace the *nix way, automate away tedious tasks and strive to build infrastructure as code whenever possible.
Projects include:
- Designing and deploying new IaaS architecture to provide private cloud to internal partners by using tools like OpenStack.
- Designing and deploying tooling and frameworks to facilitate the transition to a hybrid-cloud world.
- Building an automated service lifecycle platform to manage the full lifecycle of all infrastructure (server, storage, network and site).
- Developing comprehensive monitoring tools that provide visibility into the performance and reliability of our infrastructure.
- Automating testing infrastructure to accelerate the velocity at which we can deploy changes.
You are an ideal candidate if you:
- Have 3+ years of work experience in software development, particularly in cloud systems, networking, distributed systems, databases, and data processing frameworks.
- Script or code in 1-2 languages like Ruby, Scala, Python or Bash. You are comfortable digging into other people’s source code in search of the root cause of a problem, and you like to automate all the things.
- Have previous experience working with cloud management platforms: OpenStack, CloudStack, etc.
- Have experience working on production systems where you responded to issues to minimize customer downtime. This role requires being part of a workday on-call rotation.
- Believe in the Unix way. You build large systems out of small components that each do one job and do it well. We run Debian.
Bonus points for:
- Experience with SRE/DevOps/infrastructure tasks
- Experience with private cloud management platforms (OpenStack)
- Interesting personal projects or contributions to open-source projects
- A BS/MS/PhD in Computer Science, Computer Engineering, or a STEM field
Cisco is an Affirmative Action and Equal Opportunity Employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, national origin, genetic information, age, disability, veteran status, or any other legally protected basis. Cisco will consider for employment, on a case by case basis, qualified applicants with arrest and conviction records.
At Cisco Meraki, we’re challenging the status quo with the power of diversity, inclusion, and collaboration. When we connect different perspectives, we can imagine new possibilities, inspire innovation, and release the full potential of our people. We’re building an employee experience that includes appreciation, belonging, growth, and purpose for everyone.