Job Title: HPC (High-Performance Computing) System Admin
Location: 100% REMOTE - India
Employment Type: Contract to hire role - 6 to 9 Months Contract

Capacity and Duration
This role is not full-time. We are looking for a candidate with 50% capacity (20 hours per week) for a duration of six months.

Daily Schedule and Expectations
We are looking for a consistent daily presence rather than fragmented hours. The ideal breakdown for the 4-hour workday is as follows:

Monitoring system health and addressing urgent incidents:
- Bright Cluster Manager
- Slurm queue status and stuck jobs
- Authentication system status
- Hardware alerts and BMC notifications

Ticket Review: Triage, prioritization, and initial response to the service desk queue.

Project & Escalation (remaining daily time): Addressing escalated issues and planning future system improvements.

Time Zone and Communication
The candidate must have significant overlap with US Eastern Time (EST). The team's technical expert is based in EST, and frequent interaction will be required.
We are open to candidates in European time zones, provided they can maintain the necessary EST overlap.
Because this role involves critical system stability and coordination, strong English communication skills are required.

Job Overview:
We are seeking an experienced HPC System Administrator with hands-on expertise in Bright Cluster Manager, Slurm, Linux environments, and HPC command-line operations.
This role involves supporting and maintaining existing production HPC clusters, ensuring stable performance, resolving hardware issues, and assisting users to keep computational workflows running efficiently.

Key Responsibilities & Skills:
- Bright Cluster Manager: Proficient in administering and monitoring clusters through CMSH, managing system images, and maintaining cluster configurations.
- Slurm: Solid understanding of scheduler configuration, handling job prioritization, creating policy exceptions, and managing reservations.
- Linux Administration: Strong background in system management, troubleshooting, and providing technical support to users.
- Hardware Diagnostics: Ability to identify hardware faults, perform basic server-level troubleshooting, and pinpoint failing components.
- BMC/Remote Management: Familiarity with Dell iDRAC, HPE iLO, and Supermicro management interfaces.

Thanks & Best Regards
Piyush Sharma
Recruitment
eMail: Psharma@ | 7014 East Camelback Road, Suite 1452
Scottsdale, Arizona 85251
Job Title: HPC (High-Performance Computing) System Admin with strong Slurm, Bright Cluster Manager, HPC Command Line experience - 100% REMOTE - EST Work Shifts