Job Title


InfiniBand Engineer (High-Performance Networking)


Company : Aptly Technology Corporation


Location : Belgaum, Karnataka


Created : 2026-02-18


Job Type : Full Time


Job Description

Job Summary

We are seeking a highly skilled InfiniBand Engineer with strong expertise in advanced networking technologies to design, deploy, and support high-performance, low-latency network infrastructures. The ideal candidate will have hands-on experience with InfiniBand fabrics, data center networking, and large-scale distributed computing environments (HPC / AI / ML clusters).

Key Responsibilities

- Design, implement, and manage large-scale InfiniBand (IB) fabrics in data center and HPC environments.
- Configure and troubleshoot InfiniBand switches and adapters (e.g., Mellanox / NVIDIA IB platforms).
- Perform fabric bring-up, subnet management (OpenSM), partitioning, and performance tuning.
- Monitor and optimize network performance, latency, throughput, and congestion control.
- Integrate InfiniBand with Ethernet-based networking environments.
- Support RDMA technologies (RoCE, iWARP) and GPUDirect environments.
- Collaborate with system, storage, and compute teams to support AI/ML and distributed workloads.
- Perform firmware upgrades, patching, and capacity planning.
- Troubleshoot Layer 2 / Layer 3 networking issues (BGP, OSPF, VLAN, VXLAN, etc.).
- Maintain documentation, network diagrams, and SOPs.

Required Skills & Qualifications

- 5+ years of networking experience with strong fundamentals (TCP/IP, routing, switching).
- Hands-on experience with InfiniBand technologies (HDR/NDR preferred).
- Experience with NVIDIA / Mellanox Technologies switches and adapters.
- Strong understanding of RDMA, congestion control, QoS, and low-latency tuning.
- Experience with subnet managers (OpenSM) and fabric diagnostic tools.
- Solid understanding of BGP, OSPF, EVPN-VXLAN, MPLS (good to have).
- Experience in HPC, AI/ML cluster networking environments is highly preferred.
- Familiarity with Linux networking and troubleshooting tools.
- Experience with automation (Python, Ansible) is a plus.

Preferred Qualifications

- Experience supporting large GPU clusters.
- Knowledge of storage networking (NVMe-oF, parallel file systems).
- Experience with monitoring tools and telemetry systems.
- Networking certifications (CCNP/CCIE or equivalent).

Key Competencies

- Strong analytical and troubleshooting skills.
- Ability to work in high-performance, mission-critical environments.
- Excellent documentation and communication skills.
- Proactive problem-solving mindset.