OXIO

Site Reliability Engineer

Posted 6 Days Ago
Remote
Hiring Remotely in USA
Mid level
As a Site Reliability Engineer, you will design cloud platforms, automate operations, maintain infrastructure, and support engineering teams in delivering reliable services.
The summary above was generated by AI

Site Reliability Engineer
OXIO is the first NeoTelco. We are building the world’s largest, most accessible, and most insightful telecom network. Our platform empowers anyone to spin up their own carrier from a browser, and it scales with and supports you as your network grows to millions of users.

We ensure that users and devices are connected, and stay connected, wherever they go: across countries, carriers, and cellular technologies. We help them pay less for mobile data. This technology is provided through our Carrier-as-a-Service platform, BrandVNO, a fully customizable telecom service. In addition, we enable clients of our service to extract value from telecom data, enriching their customer experience, business intelligence, and product understanding in the many markets in which we operate.

Come join us in creating a modern technology platform with a group of engineers dedicated to advancing our vision. Our team is passionate about what we build, open to new ideas and challenges, and has its sights set on the future of connectivity.

Responsibilities

  • Design and implement a cloud platform to support OXIO backend services

  • Automate technical operations: deployments, scaling, recovery, etc.

  • Monitor and maintain mission-critical production infrastructure to ensure maximum uptime

  • Participate in an on-call rotation and a culture of continuous improvement through blameless postmortems

  • Enable the Engineering, Telecom, and Data Engineering teams by providing them with the tools to operate the services they build

Essentials
  • Understanding of Linux/Unix systems (most systems are Linux-based).

  • Familiarity with Linux/Unix system internals like process management, filesystems, memory management, and networking.

  • Proficiency in at least one programming language (Python, Go, or Ruby) and strong skills in scripting (Bash, Perl).

  • Experience with infrastructure provisioning tools such as Terraform, CloudFormation, or Ansible.

  • Familiarity with containerization (Docker) and orchestration tools (Kubernetes).

  • Familiarity with monitoring tools like Prometheus, Grafana, or Datadog.

  • Knowledge of setting up alerts, analyzing logs, and creating dashboards for observability.

  • Familiarity with incident management practices (e.g., runbooks, postmortems).

  • Experience participating in an on-call rotation and handling incidents.

  • Experience in setting up and maintaining Continuous Integration/Continuous Delivery pipelines (Jenkins, GitLab CI, CircleCI, etc.).

  • Hands-on experience with cloud providers (AWS, Google Cloud, Azure).

  • Knowledge of virtualization technologies (VMware, KVM) and cloud-native architecture.

  • Understanding of TCP/IP, DNS, HTTP/HTTPS, load balancing, and firewalls.

Nice to have
  • Strong understanding of deployment strategies (canary releases, blue-green deployments, etc.).

  • Familiarity with high availability and failover mechanisms.

  • Familiarity with IAM (Identity and Access Management) and zero trust principles.

  • Experience working with distributed systems (e.g., Kafka, Cassandra, Elasticsearch).

  • Experience building custom monitoring tools or writing complex automation scripts.

  • Functional knowledge of database management (SQL and NoSQL).

  • Familiarity with distributed tracing (Jaeger, OpenTelemetry) and advanced log aggregation strategies (ELK stack, Splunk).

  • Familiarity with performance profiling tools and optimizing application performance under heavy load.

  • Familiarity with load testing and identifying bottlenecks.

  • Familiarity with configuration management using SaltStack for maintaining server configurations.
