Job Title


Senior Site Reliability Engineer (Lead)


Company : ACL Digital


Location : Kolhapur, Maharashtra


Created : 2026-01-24


Job Type : Full Time


Job Description

- Continuously monitor system performance and identify potential issues before they impact users.
- Experience working with industry-leading monitoring tools.
- Respond to incidents related to monitoring systems, troubleshoot Level 1 issues, and resolve them promptly.
- Analyze monitoring data to identify trends, anomalies, and potential issues.
- Collaborate with development and platform teams to enhance application performance and reliability.
- Participate in the on-call rotation to support production systems.
- Automate monitoring processes to improve efficiency and reduce manual overhead.
- Create dashboards.
- Recommend best practices in observability, alerting, and incident management.
- Understanding of distributed systems and microservices architecture.
- Conduct post-incident reviews and refine monitoring practices based on findings.
- Provide support during major incidents, coordinating response efforts and communication.

Qualifications :
- Proven experience in a Site Reliability Engineer or DevOps role.
- Hands-on experience with the AWS cloud platform and container technologies (Docker, Kubernetes).
- Understanding of networking, Linux operating systems, and database management.
- Familiarity with monitoring and logging tools (e.g., Prometheus, Grafana, ELK Stack).
- Excellent problem-solving skills and a proactive attitude.
- Strong communication and collaboration skills.
- Experience in incident management and response frameworks.
- Familiarity with incident management tools like PagerDuty is an added advantage.

Skill sets :
1. AWS
2. Linux
3. NOC (Datadog, Grafana, Prometheus)
4. Kubernetes
5. Database Management
6. Terraform
7. Automation