Job Title


Senior MLOps Engineer


Company : Quantiphi


Location : Bengaluru, Karnataka


Created : 2025-07-24


Job Type : Full Time


Job Description

Role : Senior Platform Engineer (MLOps / LLMOps)
Experience : 3 to 6 Years
Location : Bangalore / Mumbai / Trivandrum (Hybrid)

Job Summary:
Join our team as a Platform Engineer and apply your expertise in production-scale platforms within the GenAI or ML domain. In this role, you will develop and maintain build and test environments for critical GenAI workloads running on foundational cloud infrastructure. You will partner with architects to design and implement robust, scalable systems, and provide development support to SRE/Operations teams as they tackle complex distributed systems challenges at scale. We are seeking an engineer who champions Quantiphi's dedication to Cloud-Native development, with a particular emphasis on Kubernetes.

Job Responsibilities:
As a Platform Engineer, you will play a pivotal role in designing, implementing, and optimizing our infrastructure. Your responsibilities will include:
- Implementing state-of-the-art GPU compute clusters to support critical workloads.
- Developing comprehensive automated testing strategies and frameworks across unit, integration, API, and end-to-end levels for critical commerce flows.
- Creating robust performance testing frameworks to validate platform scalability and resilience and to identify optimization opportunities.
- Developing comprehensive monitoring solutions with alerting systems to track platform health and ensure SLA compliance.
- Building a scalable automation infrastructure that supports growing platform capabilities with consistent test environments.
- Troubleshooting, diagnosing, and performing root cause analysis of system failures, isolating components and failure scenarios in collaboration with internal and external partners.
- Optimizing cluster operations for maximum reliability, efficiency, and performance.

Job Requirements:
We are seeking a highly skilled and passionate Platform Engineer with:
- Over 3 years of direct, hands-on experience building and deploying large-scale, production-ready services on Kubernetes.
- A proven history of engaging with and contributing to open-source projects.
- A collaborative spirit, demonstrated by prior work developing scalable software solutions for cloud services.
- The ability to effectively communicate complex technical designs and decisions to internal teams.
- An understanding of GPU computing and AI infrastructure.
- A strong passion for solving complex technical challenges and optimizing system performance.
- Working knowledge of cluster configuration management tools such as BCM or Ansible, and infrastructure-level applications including Kubernetes, Terraform, and MySQL.
- In-depth understanding of container technologies such as Docker.
- Proficiency in programming with Python and Bash scripting.

Ways To Stand Out From The Crowd:
Candidates who possess the following will be highly competitive:
- Significant experience with sophisticated infrastructure tooling, including Kubernetes Cluster API, Terraform, Helm, and Operator Framework.
- Practical, production-level experience with at least one major cloud platform: Azure Cloud, Google Cloud Platform (GCP), or Amazon Web Services (AWS).
- A strong track record of successfully refactoring and optimizing software for deployment within Kubernetes environments.
- Understanding of the CNCF landscape and its associated tooling.
- The ability to decompose complex problems into simpler sub-problems, leverage existing solutions for efficient implementation, and design simple, self-sustaining systems.
- Experience leveraging AI/ML to proactively detect and resolve incidents, automate alert triaging, perform log analysis, and streamline repetitive workflows.