Job Title


Site Reliability Engineer


Company : AQUMEN LABS


Location : Bengaluru, Karnataka


Created : 2025-12-18


Job Type : Full Time


Job Description

Company Description

AQUMEN Labs is a trusted Quality Engineering, DevOps, and AI Consulting partner to several industry leaders and high-growth startup companies across India and global markets. We actively engage with the startup ecosystem to help teams build resilient, scalable, and production-ready platforms by applying lessons learned from working with companies that have successfully scaled from early-stage ventures to global enterprises.

Our customers value the engineering-first, automation-driven approach that AQUMEN Labs brings to their technology organizations. By embedding modern quality practices, cloud-native DevOps, and AI-led insights into delivery pipelines, we help teams achieve faster releases, higher reliability, zero-touch deployments, and measurable business outcomes, while remaining cost-efficient and operationally lean.

Our clientele spans India, the USA, the UK, the Middle East, Australia, and Southeast Asia. We thrive on solving complex engineering challenges across modern technology stacks, including cloud-native platforms, distributed microservices, CI/CD and GitOps, observability and SRE, AI/ML systems, data platforms, enterprise applications, and large-scale platform implementations. Our expertise extends across consumer-centric and regulated industries such as eCommerce, Payments & FinTech, Banking, Retail, Media, Healthcare, SaaS, and Digital Platforms.

At AQUMEN Labs, we combine quality engineering, DevOps excellence, and AI-driven intelligence to help organizations build software that is not only functional but also reliable, scalable, and future-ready.

Role Description

This is a consulting, on-site role for a Site Reliability Engineer located in Bengaluru. The Site Reliability Engineer will be responsible for ensuring the reliability, scalability, and performance of the company’s systems and services.
Day-to-day responsibilities include designing and implementing reliable infrastructure, monitoring system performance, identifying and resolving incidents, and automating processes to minimise manual intervention. The engineer will collaborate with development and operations teams to optimise system performance while maintaining high availability and security.

Qualifications

- Proficiency in Cloud Infrastructure Management and Container Orchestration (e.g., AWS, Azure, GCP, Kubernetes, Docker)
- Skill in Monitoring, Logging, and Incident Management tools (e.g., Prometheus, Grafana, ELK Stack, PagerDuty)
- Strong knowledge of Programming and Scripting languages (e.g., Python, Go, Shell scripting)
- Experience with Configuration Management and Automation tools (e.g., Terraform, Ansible, Puppet, Chef)
- Understanding of Networking and Systems Architecture (e.g., DNS, load balancing, clustering, performance tuning)
- Knowledge of CI/CD pipelines and version control systems (e.g., Jenkins, Git, GitHub)
- Experience with database and storage systems (e.g., MySQL, PostgreSQL, NoSQL databases)
- Problem-solving skills with a focus on root cause analysis and continuous improvement
- A bachelor's degree in Computer Science, Software Engineering, or related field, or equivalent work experience
- Previous experience in a similar role or DevOps is a plus