Role: Site Reliability Engineer (SRE)
Location: Hyderabad – Marriott Office
Work Mode: 5 Days Work From Office
Experience Required: 7+ Years
Notice Period: Immediate to 15 Days
Budget: Up to 25 LPA

About the Role
We are looking for a highly skilled Site Reliability Engineer (SRE) to manage and enhance the reliability, scalability, and performance of cloud-based production systems. The ideal candidate will have strong experience in AWS, automation, infrastructure as code, and monitoring tools to ensure highly available and resilient systems.

Key Responsibilities:
- Design, implement, and maintain scalable and highly available infrastructure on AWS.
- Automate infrastructure provisioning and configuration using Terraform and Ansible.
- Develop automation scripts using Python and Bash for operational efficiency.
- Deploy, manage, and optimize containerized workloads using Kubernetes.
- Design, implement, and maintain robust CI/CD pipelines for reliable deployments.
- Monitor system health, performance, and availability using tools like Dynatrace, Prometheus, Grafana, and the ELK stack.
- Perform incident management and root cause analysis, and implement preventive solutions.
- Collaborate with development and engineering teams to improve system reliability and performance.
- Ensure adherence to cloud security, reliability, and operational best practices.

Required Skills:
- 7+ years of experience in Site Reliability Engineering, DevOps, or related roles.
- Strong hands-on expertise in AWS services and scalable architecture design.
- Proficiency in Python and Bash scripting for automation.
- Hands-on experience with Terraform, Kubernetes, and Ansible.
- Strong experience in CI/CD pipeline design and release engineering.
- Experience with monitoring and observability tools such as Dynatrace, Prometheus, Grafana, ELK, or similar platforms.
- Strong troubleshooting, analytical, and problem-solving skills.

Preferred Qualifications:
- Prior experience working as a Site Reliability Engineer (SRE).
- Experience managing production environments with high availability.
- Strong understanding of cloud security and reliability best practices.
Job Title
Site Reliability Engineer