Role: GPU Compute Architect

As a GPU Compute Architect, you will design, build, and maintain the foundational systems and distributed infrastructure that power AI model post-training, inference, and data pipelines. You will collaborate with engineering teams to ensure the performance, scalability, and reliability of critical AI systems. You will be a member of a core team of highly talented industry specialists and will work with the latest GPU compute hardware and software technology.

Core Responsibilities

1. Mandatory Skill Sets:

Architectural Design: Understanding of GPU architecture, including GPU processing units, Graphics Processing Clusters (GPCs), Streaming Multiprocessors (SMs), warps, CUDA cores, Tensor Cores, ray tracing cores, GPU interconnects (NVLink interfaces, NVLink switch chips), and NCCL primitives, as well as the concepts of SIMT and SIMD.

End-to-End AI Infra Design: Design end-to-end AI infrastructure solutions for private and public cloud systems, sizing compute, network, and storage for a given infrastructure power, sustained FLOPs, or AI training parameters. Focus on superclusters that leverage NVIDIA H200/B300/GB300 or AMD accelerators. Develop GPU infrastructure best-practice guidelines.

Proof of Concept (PoC): Lead deep-dive technical evaluations that demonstrate Aptly’s superior price-performance ratios for model training and fine-tuning.

Stack Integration: Assist customers in deploying and optimizing the AI Enterprise stack, and provide GPU incident support.

GPU Benchmarking & Performance Tuning/Monitoring: Profile workloads (LLaMA3, Rodinia, 3DMark) with Nsight/DCGM; triage bottlenecks such as kernel inefficiencies, NCCL ring imbalances, or NVMe IOPS saturation. Tune for higher Streaming Multiprocessor occupancy, correlating results with Prometheus alerts (e.g., thermal throttling rules).

Qualifications: 10 or more years of experience building large-scale GPU compute AI infrastructure and distributed systems for private and public cloud platforms.
Educational Qualifications: Bachelor’s or Master’s in Computer Science Engineering, IT, Electrical Engineering, or an equivalent engineering degree.

Soft Skills: Strong analytical and problem-solving skills, customer handling, adaptability, effective communication, collaboration, time management, independent thinking, and the ability to work well in a team.

Preferred Certifications: NVIDIA-Certified Professional: AI Infrastructure (NCP-AII), ITIL v4/v5, AZ-305: Designing Microsoft Azure Infrastructure Solutions.
Job Title: GPU Architect