Jump Trading Group is committed to world-class research. We empower exceptional talents in Mathematics, Physics, and Computer Science to seek scientific boundaries, push through them, and apply cutting-edge research to global financial markets. Our culture is unique. Constant innovation requires fearlessness, creativity, intellectual honesty, and a relentless competitive streak. We believe in winning together and unlocking individual talent through collaboration and mutual respect. At Jump, research outcomes drive more than superior risk-adjusted returns; we design, develop, and deploy technologies that change our world, fund start-ups across industries, and partner with leading global research organizations and universities to solve problems.

We are looking for an adaptable, hands-on individual who is passionate about managing Linux HPC environments at scale and eager to tackle complex operational work as their primary focus.

What You'll Do:
- Provide front-line operational support for 24/7 Linux HPC compute, storage, and interconnects, involving technologies such as RDMA fabrics, parallel filesystems, HPC batch schedulers, FUSE filesystems, internal Jump software, multi-vendor hardware, cybersecurity requirements, and high user expectations.
- Resolve problem reports and answer questions from Jump's research community, managing the entire problem lifecycle and escalating issues as needed.
- Respond promptly to alerts.
- Participate in large, coordinated maintenance operations, including evenings and weekends.
- Work on global infrastructure projects.
- Write code for diagnosing, resolving, and automating tasks related to system problems.
- Collaborate across teams to develop and test code in multiple programming languages.
- Manage relationships with vendors, including travel for meetings.
- Implement and support performance and fault monitoring systems.
- Develop and maintain documentation for systems and users.
- Monitor tools used for maintaining the computing environment.
- Provide operational support as a primary responsibility.
- Follow all cybersecurity and IT policies, using only approved hardware and software.
- Participate in an on-call rotation.
- Perform other tasks as needed.
- Work from the company office an average of 5 days a week.
- Be willing to work during maintenance windows on a rotational basis, including Friday evenings or Saturday mornings.

Skills You'll Need:
- A strong interest in operational work as your primary role.
- At least 2 years of experience with Linux systems.
- Experience with HPC technologies such as parallel filesystems, batch systems, and high-performance networks is a plus but not mandatory.
- Proficiency in at least one programming or scripting language (e.g., Go, Python, C) and the ability to adapt quickly to others.
- Ability to perform root cause analysis.
- Excellent verbal and written communication skills.
- Strong collaboration skills and willingness to handle various technologies.
- Ability to manage complex projects independently.
- A strong sense of urgency.
- Willingness to perform operational maintenance during evenings and weekends.
- Ability to work effectively in a busy office environment.
- Reliable and predictable availability.
Job Title
HPC Linux Operations Engineer