Job Description
Benefits:
- 401(k)
- 401(k) matching
- Dental insurance
- Health insurance
- Paid time off
- Training & development
- Tuition assistance
- Vision insurance
OVERVIEW:
SILTT is searching for a dynamic and highly motivated Network Operations Center (NOC) Incident Manager to provide leadership and coordination for all major incidents affecting network and service availability within a 24x7x365 Network Operations Center. This role ensures timely detection, escalation, communication, and resolution of Priority 1 and Priority 2 events, driving continuous service restoration and post-incident learning. The Incident Manager serves as the command authority during critical outages, leading cross-functional bridge calls, ensuring accurate documentation, and managing stakeholder communications through all phases of the incident lifecycle. If you are excited by the opportunity to join our team as a NOC Incident Manager, we encourage you to apply today!
WHO WE ARE:
At SILTT, we're pushing the limits of infrastructure innovation in the Telecommunications and Information Technology industry. From delivering world-class modular data center facilities to all-hours, 365-day operational response and disaster recovery, our multi-functional team of experts are force multipliers across the infrastructure landscape. We pride ourselves on leading from the front to advise, assist, and accompany our clients through their toughest technological and operational challenges. We always deliver results (spelled re-SILTTs)!
WHY SILTT?
At SILTT, objective-driven means first being people-driven. As a small business, we know that achieving our mission demands we take care of our own by providing our team members with a variety of benefits that allow them to live fulfilling, healthy, balanced, meaningful lives. That's why we offer paid healthcare, ultra-competitive 401(k) matching, accrued paid time off and fixed holiday leave, and continuous learning and professional development incentives, and why we promote a sustainable work-life balance.
A CALL TO ACTION:
As we charge ahead in the competitive world of technology and sustainment, we need a strong NOC Incident Manager to support our current and future projects. This critical position will collaborate with fellow SILTT teammates, stakeholders and executive leadership. As we staff up to support a new program, this NOC Incident Manager will have the opportunity to be on the ground floor and help define the trajectory of our future!
A DAY IN THE LIFE:
In this role, you will support a high-impact Network Operations Center that forms the backbone of enterprise service delivery. This role provides the opportunity to modernize NOC best practices and shape a world-class operations culture that scales while maintaining customer trust and operational excellence. Key responsibilities include, but are not limited to:
1. Major Incident Management:
- Act as Incident Commander for all Priority 1 and Priority 2 events, maintaining operational control and clear communication during service disruptions.
- Lead bridge calls, coordinate restoration efforts, and ensure technical and management stakeholders remain aligned.
- Track Mean Time to Acknowledge (MTTA) and Mean Time to Restore (MTTR) performance; ensure SLAs are consistently met.
- Coordinate escalation to L2/L3 engineering, vendors, or facilities teams to accelerate fault isolation and recovery.
- Provide continuous updates to customers and executives during major incidents.
2. Post-Incident Review and Continuous Improvement:
- Conduct Post-Incident Reviews (PIRs) within defined SLA windows (e.g., P1 < 5 business days).
- Document root cause, contributing factors, and corrective actions; ensure lessons learned are captured in runbooks.
- Partner with Problem Management to identify chronic issues and implement long-term remediation.
- Drive continuous improvement initiatives to reduce incident recurrence and improve restoration times.
3. Process and Governance:
- Maintain adherence to ITIL Incident and Problem Management processes.
- Refine triage workflows, escalation matrices, and bridge playbooks.
- Contribute to weekly MBR and monthly QBR reports, summarizing incident trends, root causes, and performance metrics.
- Collaborate with Change and QA teams to improve system reliability and reduce change-induced incidents.
4. Leadership and Training:
- Mentor Level 1 and Level 2 staff in escalation handling, communication, and technical triage.
- Provide feedback and coaching on incident response quality and communication etiquette.
- Participate in training and drills (e.g., mock P1 scenarios, disaster recovery table-top exercises).
BASIC QUALIFICATIONS:
- Bachelor's degree in Information Technology, Computer Science, or related field.
- 10+ years of network operations or service delivery experience, including 5+ years in incident management or major incident coordination.
- Proven experience in ITIL-based environments with strong knowledge of incident, problem, and change management.
- Familiarity with network infrastructure, monitoring systems, and ITSM tools (ServiceNow, SolarWinds, Splunk, LogicMonitor, etc.).
- Exceptional communication, analytical, and crisis management skills.
- ITIL v4 Foundation or Intermediate certification; PMP or equivalent operational certification desirable.
SKILLS & COMPETENCIES:
- Major Incident Command and Decision-Making
- Stakeholder Communication and Executive Updates
- Cross-Functional Coordination (NOC, Engineering, Facilities, Security)
- Root Cause and Post-Incident Review Execution
- ITIL Governance and Continuous Improvement
- Calm Under Pressure / Situational Awareness
WORK LOCATION:
The work associated with this role is expected to be performed on-site at our San Diego area location, though some travel may be required for periodic support.