Director Of Platform Engineering & Operations

Job Description

Description

Role Summary

The Director of Platform Engineering & Operations is responsible for the company's entire technology platform—overseeing both customer-facing systems and internal infrastructure to ensure 24x7 availability, security, and scalability across Azure cloud and on-premise environments. This hands-on leadership role balances technical execution and strategic management, including building a high-performing team, driving operational excellence, implementing security controls, and supporting the company’s rapid growth.

Key Responsibilities

Leadership & Strategy

Build, mentor, and retain a team of 8 engineers across infrastructure/network, DevOps/SRE, and desktop/end-user support, providing technical coaching, career development, and performance management.

Define and track operational KPIs (availability, MTTR, change success rate, incident volume, cloud cost efficiency) and present regular updates to the CTO and executive team.

Take full ownership of platform strategy, roadmap, and execution aligned with business objectives, product needs, and customer SLAs.

Establish an operational cadence (incident reviews, change advisory board, service desk metrics, team retrospectives) and a culture of continuous improvement.

Platform Operations & Architecture

Own the design, implementation, and 24x7 operation of the company's hybrid infrastructure (Azure + on-premise) supporting both production and internal corporate systems.

Ensure high availability, scalability, performance, security, and cost efficiency across all environments.

Hands-on architecture and implementation of cloud infrastructure, networking, identity management (Azure AD/Entra, RBAC), storage, backup, monitoring, and observability.

Drive cloud optimization initiatives: rightsizing, reserved capacity, architectural improvements, and cost governance across Azure workloads.

Define and enforce platform standards for networking, security, logging, alerting, and operational discipline.

DevOps & Site Reliability Engineering

Lead DevOps and SRE transformation: implement CI/CD pipelines, Infrastructure as Code (Terraform, ARM/Bicep), containerization (Kubernetes), and modern deployment practices

Hands-on implementation of Kubernetes clusters, container orchestration, service mesh, and cloud-native architecture patterns

Establish SRE principles (error budgets, SLOs/SLIs, blameless postmortems) to reduce deployment risk and increase developer productivity

Implement robust change management processes (risk assessment, testing, communication, rollback procedures) that balance speed, safety, and audit readiness

Information Security & Compliance

Implement security and compliance controls, including access management, logging and monitoring, vulnerability management, incident response, and audit evidence collection.

Establish security best practices across infrastructure: network segmentation, firewall rules, encryption (data at rest/in transit), secrets management, privileged access management.

Lead incident response for infrastructure and platform issues, including root cause analysis, remediation, and process improvements.

Own Disaster Recovery strategy and execution: define RPO/RTO targets, architect multi-region and hybrid DR solutions, develop runbooks, and conduct regular DR testing

Ensure backup and restore capabilities across all critical systems with documented procedures and validated recovery processes

Desktop & End-User Support

Oversee desktop, endpoint, and telecom services (laptops, mobile devices, productivity tools, collaboration platforms, voice/conferencing) to deliver reliable, secure employee experiences

Implement IT service management practices (incident, request, problem, asset management) with clear SLAs and user satisfaction metrics

Manage vendor relationships across infrastructure, telecom, SaaS, and managed services—evaluate contracts, optimize licensing, and ensure service quality

The individual needs to be hands-on, able to manage multiple data centers (including servers, VMware, Cisco, telecom, firewalls, load balancers), and manage hybrid cloud environments (ideally Azure, but AWS/GCP experience is acceptable).

They should have experience with governance, cost optimization, vendor management, and security management for cloud platforms.

The candidate must be able to communicate and work closely with the DevOps team, make architecture design decisions, and be a strong partner with the app dev team.

Experience with containers (Kubernetes), infrastructure as code, and a strong focus on security are also important.

They need to be able to interact with everyone from help desk staff to co-founders.

Skills

devops, data center operations, cloud, hybrid, fleet management, itsm, SOC 2, azure, kubernetes

Top Skills Details

devops, data center operations, cloud, hybrid

Additional Skills & Qualifications

Skills

Experience in B2B SaaS, telematics, fleet management, IoT, or other real-time, data-intensive platforms serving enterprise customers

Familiarity with ITSM tools (Jira Service Management, ServiceNow), configuration management databases (CMDB), and IT asset management practices

Experience with observability and monitoring platforms (Datadog, New Relic, Prometheus/Grafana, Azure Monitor, Application Insights)

Experience supporting real-time GPS tracking, vehicle telematics, or IoT device management platforms

Relevant certifications: Microsoft Certified: Azure Solutions Architect Expert, Azure Administrator Associate, CISSP, CISM, ITIL Foundation or higher

Experience scaling infrastructure to support rapid business growth (2x–3x revenue in 2–3 years)

Prior experience operating in regulated or compliance-driven environments (SOC 2, ISO 27001, HIPAA, FedRAMP)

Hands-on experience with Azure Kubernetes Service (AKS), Azure DevOps, GitHub Actions, or similar CI/CD platforms

Understanding of fleet management industry compliance requirements (FMCSA, ELD mandates, hours-of-service regulations)

Experience Level

Expert Level

Job Type & Location

This is a Permanent position based out of Charlotte, NC.

Pay and Benefits

The pay range for this position is $200,000.00 - $240,000.00/yr.

Employee Value Proposition (EVP)

Free breakfast and lunch at the company cafeteria, on-site childcare and gym, and a pet-friendly workplace.

New AI video analytics tool; the team needs an AI engineer. The VTrack Vision team builds GPU-accelerated video analytics for real-time safety monitoring across large fleets and industrial environments. The system processes high-volume video streams, runs YOLO-based detection models, performs temporal tracking and smoothing to reduce false positives, and identifies actionable safety violations. Inference results are published to downstream APIs and integrated with Azure Event Hub, Blob Storage, and cloud monitoring systems.

If you enjoy pushing GPU performance limits, crafting resilient ML pipelines, and building real-world safety applications that make an impact, you'll fit right in.

- Wanting to hire FTE
- Open to contract if necessary

Workplace Type

This is a fully onsite position in Charlotte, NC.

Application Deadline

This position is anticipated to close on Apr 8, 2026.

About TEKsystems:

We're partners in transformation. We help clients activate ideas and solutions to take advantage of a new world of opportunity. We are a team of 80,000 strong, working with over 6,000 clients, including 80% of the Fortune 500, across North America, Europe and Asia. As an industry leader in Full-Stack Technology Services, Talent Services, and real-world application, we work with progressive leaders to drive change. That's the power of true partnership. TEKsystems is an Allegis Group company.

The company is an equal opportunity employer and will consider all applications without regard to sex, veteran status, genetic information, or any characteristic protected by law.

About TEKsystems and TEKsystems Global Services

We’re a leading provider of business and technology services. We accelerate business transformation for our customers. Our expertise in strategy, design, execution and operations unlocks business value through a range of solutions. We’re a team of 80,000 strong, working with over 6,000 customers, including 80% of the Fortune 500 across North America, Europe and Asia, who partner with us for our scale, full-stack capabilities and speed. We’re strategic thinkers, hands-on collaborators, helping customers capitalize on change and master the momentum of technology. We’re building tomorrow by delivering business outcomes and making positive impacts in our global communities. TEKsystems and TEKsystems Global Services are Allegis Group companies. Learn more at TEKsystems.com.

Charlotte, NC
Full time

Published on 04/07/2026