Job Description
- HPC Systems Engineer I
- Location: Cincinnati, OH (hybrid model)
- Employment Type: Direct Hire
- Start Date: ASAP
Overview
This is a direct-hire, Cincinnati-based role supporting high-performance computing (HPC), GPU, and research data platforms used across genomics, AI/ML, and large-scale data analysis. The role is well-suited for an early-career systems engineer looking to grow in a mission-driven, research-focused environment.
Key Responsibilities
HPC & Linux Systems Support
- Support day-to-day operations of HPC clusters, including:
  - User and account management
  - Job scheduler support and basic troubleshooting
  - Resource monitoring and utilization
  - Building and maintaining software modules
- Troubleshoot batch job failures and assist researchers with execution issues
- Support GPU and AI workloads as part of a modern research computing environment
System Engineering & Operations
- Assist with analysis, design, implementation, and maintenance of Linux-based systems (primarily RHEL)
- Participate in system testing, validation, and documentation
- Contribute to platform reliability, performance, and scalability improvements
- Follow change management and operational procedures
Technical & End-User Support
- Provide responsive technical support to researchers and internal teams
- Communicate clearly via email and incident management tools
- Monitor systems and help identify opportunities for future enhancements
- Participate in on-call rotations as needed
Collaboration & Growth
- Work closely with researchers, engineers, and IT partners
- Contribute to ongoing modernization initiatives including AI and GPU platform evolution
- Maintain accurate technical documentation
Required Qualifications
- Bachelor’s degree in a related field, or equivalent experience
- Foundational Linux experience
- Exposure to HPC environments, including:
  - User/account creation
  - Batch job systems
  - Basic job troubleshooting
  - Software module builds
- Strong written and verbal communication skills
- Willingness to learn and grow in a research computing environment
Preferred Qualifications
- Experience with HPC schedulers such as Slurm or LSF
- Experience imaging and building RHEL-based servers
- Familiarity with configuration management tools (Ansible, Puppet, Satellite, or similar)
- Experience with LDAP, DNS, Apache, networking, storage, and Linux system logging
- Basic server and data center hardware knowledge
- Experience in healthcare, research, or academic computing environments
How Success Is Measured
- Timely and effective resolution of HPC and system issues
- Stability and performance of supported platforms
- Ability to ramp quickly and work independently
- Quality of communication with researchers and internal teams
- Positive feedback from users and leadership