Astreya

Incident Response Analyst II

Posted 2 Days Ago
In-Office
San Jose, CA, USA
73K-115K Annually
Mid level
Monitor alerts across data centers, cloud, and facilities; triage and respond to incidents within SLA timelines; act as incident commander during active events; create and maintain tickets, runbooks, and PIRs; coordinate global handovers and collaborate with engineers, vendors, and security teams to resolve and document incidents.
The summary above was generated by AI

We are seeking a dynamic and proactive Incident Response Analyst (IRA) to join our global operations team. This role is critical to maintaining operational integrity across data centers, cloud services, and regional facilities. The ideal candidate will be experienced in real-time monitoring, incident response, and collaborative coordination, with the ability to manage both IT infrastructure incidents and facility/environmental alerts. The IRA must consistently meet strict Service Level Agreement (SLA) timelines to ensure rapid detection, response, and resolution of incidents.


Key Responsibilities

Monitoring & Detection

• Continuously monitor for alerts and alarms across:

  • IT infrastructure: server performance issues, network outages, system failures.

  • Environmental alerts relevant to operations.

  • Cloud services: cloud-based alerts and alarms.

  • External-facing services: incoming emails, including colocation maintenance notices, service requests from CDN partners, and other critical notifications.

• Responsible for initial assessment, containment, and mitigation of cloud infrastructure alerts and alarms

• Proven experience managing live infrastructure incidents across OCI/AWS/Azure/GCP within a 24/7 Operations Center.

• Serve as the first responder to all alerts and notifications—perform prompt triage, categorize severity, and initiate appropriate response actions in alignment with defined SLA timeframes.

• Create and manage alarm, incident and change tickets, ensuring documentation quality and strict adherence to SLA timelines.


Incident Response & Coordination


• Serve as the Incident Commander during active incidents, leading incident bridge calls and orchestrating response efforts in collaboration with internal teams, subject matter experts (SMEs), external vendors, and the Global Operations Center (GOC).

• Facilitate global shift handovers, ensuring seamless communication and issue tracking between regions.

• Collaborate closely with data center operators, network engineers, security personnel, and other stakeholders.

• Ensure all incident response activities comply with strict SLA timelines for acknowledgment, escalation, and resolution.


Documentation & Reporting

• Produce accurate and timely incident reports, detailing:

• Executive summary and timeline

• Root cause (actual or potential)

• Business impact and remediation steps

• Draft Post-Incident Reports (PIRs) and assist in scheduling internal post-mortem reviews.

• Maintain up-to-date standard operating procedures (SOPs), runbooks, and incident handling documentation.

• Ensure documentation and reporting tasks are completed within SLA requirements.


Operational Support & Process Improvement

• Support regional managers and program owners in maintaining operational excellence and enhancing processes


Basic Qualifications


• 3+ years of experience in a command center, NOC/FOC, or 24x7 operations environment.

• Proven ability to triage multiple concurrent incidents, with strong prioritization based on severity and risk.

• Familiarity with data center layouts, IP networking, servers, and LAN/WAN configurations.

• Experience with facility and environmental monitoring relevant to incident response.

• Proficiency with IT systems, high keyboard accuracy (minimum 25 WPM), and comfort using ticketing tools and monitoring platforms.

• Strong communication skills—able to work independently, provide clear updates, and collaborate across global teams.

• Understanding of data protection regulations (e.g., GDPR) and how to manage sensitive information securely.

• Willingness to work on-site, in rotating shifts (including nights, weekends, and holidays) as part of a global support model.

• Demonstrated ability to consistently meet or exceed SLA timelines for incident management and resolution.


Preferred Qualifications

• Strong analytical and problem-solving skills—can perform under pressure and resolve incidents efficiently.

• Exposure to project coordination or process improvement initiatives.

• Relevant certifications in cloud, server, or edge-related work.

• Ability to work weekdays or weekends, with possible shift rotation.

Salary Range

$72,960.00 - $115,200.00 USD (Salary)
  • Please note that the salary information provided herein is base pay only (gross); it does not include other forms of compensation which may or may not apply to this specific position, namely, performance-based bonuses, benefits-related payments, or other general incentives - none of which are guaranteed, may be subject to specific eligibility requirements, and are wholly within the discretion of Astreya to remit.
  • Further, the salary information noted above is a range that consists of a minimum and maximum rate of pay for this specific position. Where an applicant or employee is placed on this range will depend and be contingent on objective, documented work-related considerations like education, experience, certifications, licenses, preferred qualifications, among other factors.

Astreya offers comprehensive benefits to all Regular, Full-Time Employees, including:

  • Medical provided through UHC (PPO, HSA, Surest options) / Medical provided through Kaiser (HMO option only) for California employees only

  • Dental provided through UHC

  • Nationwide Vision provided by UHC

  • Flexible Spending Account for Health & Dependent Care

  • Pre-Tax Account for Commuter Benefit/Parking & Transit (location-specific)

  • Continuing Education and Professional Development via various integrated platforms, e.g. Udemy and Coursera

  • Corporate Wellness Program provided by Goomi Group

  • Employee Assistance Program

  • Wellness Days

  • 401k Plan

  • Basic and Supplemental Life Insurance

  • Short Term & Long Term Disability

  • Critical Illness, Critical Hospital, and Voluntary Accident Insurance

  • Tuition Reimbursement (available 6 months after start date, capped)

  • Paid Time Off (accrued and prorated, maximum of 120 hours annually)

  • Paid Holidays

  • Any other statutory leaves, paid time, or other ancillary benefits required under state and federal law

HQ

Astreya San Jose, California, USA Office

2033 Gateway Pl, Ste 500, San Jose, California, United States, 95110

Astreya Campbell, California, USA Office

Campbell, United States

Astreya Cupertino, California, USA Office

Cupertino, United States

Astreya San Francisco, California, USA Office

655 Montgomery St, STE 490 DPT #17117, San Francisco, California, United States, 94111

Astreya Sunnyvale, California, USA Office

Sunnyvale, United States
