Site Reliability Engineer

Sorry, this job was removed at 7:28 a.m. (PST) on Thursday, May 28, 2020

Who We Are

Nuro is a robotics start-up whose mission is to accelerate the benefits of robotics for everyday life. We have an elite team of entrepreneurs, engineers, designers, and scientists. We believe AI and robotics are on the cusp of transforming daily life, and we are dedicated to building meaningful products with this technology. Join us and play a critical role in our mission.

About the Role

We are looking for a Site Reliability Engineer to join our software team and work closely with our engineers to deploy, manage, and monitor highly scalable and highly available software systems and infrastructure.

About the Work

  • Maintain and support our data center infrastructure and our storage and networking systems, and help with infrastructure build-out in the data center and the cloud.
  • Be a gatekeeper of Nuro’s production software environment. Help evaluate the production readiness of Nuro software. Deploy and maintain both Nuro's homegrown software and third-party open source software on Nuro infrastructure.
  • Take open source software and build a reliable and scalable software service from it, for example a storage cluster, a deep learning framework, or a pub/sub system.
  • Develop systems and components to improve operational simplicity and system reliability, for example monitoring systems, deployment automation, and diagnostic tools.
  • Troubleshoot and fix reliability bugs in our offboard software and/or report them to our software engineering teams.
  • Help with infrastructure resource planning.
  • Guide systems architecture (build-out and improvements) by participating in design reviews and offering suggestions for reliability and optimization of production systems.

About You

  • Experience in a Site Reliability and/or Systems Engineer role in Linux / network operations.
  • Extensive distributed systems development experience; you can read the source code of Nuro or third-party systems, modify it, and improve its performance and reliability.
  • Ideally, you have experience with one or more of the following open source technologies: Ceph, Kubernetes, Docker, or a message queue (e.g. Kafka, RabbitMQ, or Redis).
  • Distributed system performance debugging experience and familiarity with tools such as perf, ftrace, Zipkin, etc. would be a big plus.
  • You have solid coding skills, preferably in Python and/or C++.
  • Understanding and hands-on experience with computer networking would be a huge plus.
  • You have keen systematic analysis and problem-solving skills.
  • You are an excellent teammate with strong communication and collaboration skills.
  • You are highly motivated, thrive in dynamic and fast-paced environments, and have a real passion for ensuring scalable and highly available systems both on-prem and in the cloud.

Nuro is an equal opportunity employer and expressly prohibits any form of workplace harassment based on race, color, religion, gender, sexual orientation, gender identity or expression, national origin, age, genetic information, disability, or veteran status.


Location

1300 Terra Bella Ave, Mountain View, CA 94043
