Job Description
Job Summary:
The System Administrator is responsible for supporting day-to-day server operations, installation, maintenance, and troubleshooting in large-scale data center environments. The role involves managing hardware deployments, ensuring optimal server performance, and providing first-level technical support for server-related incidents.
Key Responsibilities:
Server Deployment & Capacity Planning
- Plan and organize the installation of servers in racks, ensuring proper mounting and arrangement.
- Develop and implement capacity management strategies for large-scale data centers to optimize resource utilization.
- Gather server requirements and ensure that adequate space, power, and rack capacity are provisioned.
Server Installation & Configuration
- Install and configure operating systems for newly mounted servers.
- Reinstall server operating systems and resolve any related faults.
Maintenance & Troubleshooting
- Perform daily server maintenance, troubleshooting, and repair activities.
- Follow up on break-fix incidents and ensure timely resolution.
- Collaborate with remote vendors and other technical teams to resolve batch hardware failures and other issues.
- Conduct server network troubleshooting and general system diagnostics.
Asset & Lifecycle Management
- Maintain accurate data on internal systems including asset management, ticketing, and rack-related records.
- Collect, verify, and monitor online asset status and issues.
- Manage server lifecycle activities including drive erasure, hardware retirement, and relocation.
- Submit and track part RMAs (return merchandise authorizations) and media destruction requests.
Operations Support & Escalation
- Provide on-call support for issues raised by business stakeholders.
- Assist with retrofitting, and test and provide feedback on tools, systems, and platforms.
- Escalate complex technical issues to Senior SOE engineers when required.
Qualifications & Requirements:
- Education: Bachelor’s Degree or Diploma in Computer Science, Electrical Engineering, or a related field.
Technical Skills:
- Strong understanding of server operations and hardware components.
- Proficiency with Linux systems for troubleshooting server software/hardware issues.
- Basic knowledge of networking concepts, including MAC addresses, subnets, and TCP/IP.
- Familiarity with out-of-band/lights-out server management (e.g., IPMI).
- Ability to read, understand, and run simple Shell/Bash scripts.
Soft Skills:
- Strong problem-solving and analytical abilities.
- Good English communication skills and the ability to work effectively in a team.
- High sense of responsibility, enthusiasm for technical challenges, and ability to work under pressure.
- Ability to work independently with minimal supervision.
Skillset Summary:
- Server Operations & Maintenance
- Linux OS Troubleshooting
- Hardware Installation & Lifecycle Management
- Network Fundamentals (TCP/IP, MAC Addressing, Subnetting)
- Scripting (Shell/Bash)
- Asset Management & Ticketing Systems
- IPMI and Remote Server Management