Job Title


Site Reliability Engineering Lead


Company : Horizon56


Location : Chennai, Tamil Nadu


Created : 2025-05-31


Job Type : Full Time


Job Description

We are seeking an experienced and dynamic Site Reliability Engineering Lead to oversee the reliability, scalability, and performance of our critical systems. In this role, you will lead a team of Technical Support Engineers, managing both day-to-day operations and a 24/7 shift schedule. You will collaborate with cross-functional teams to ensure system availability, optimize infrastructure, and drive continuous improvement initiatives.

About the Company:

At Horizon56, we develop and implement cutting-edge software solutions for the oil & gas industry. What we have delivered so far has been so well received by our users that we now need to deliver our products at scale.

The team has a flat, low-key, and fun work environment, where you will be able to bring your opinions and skills to life and build your own future.

You will work primarily from Chennai, India, through our partner (Greaves Technologies), with teams located across Norway, Denmark, the UK, and the USA.

Key Responsibilities:

- Lead, mentor, and manage a team of Technical Support Engineers.
- Oversee and manage a 24/7 shift schedule to ensure continuous system monitoring and support.
- Develop and enforce best practices for incident management, monitoring, and system performance.
- Establish SRE principles to guide operational excellence and system reliability.
- Collaborate with software engineering teams to improve service.
- Automate repetitive tasks to improve efficiency and reduce manual intervention.
- Conduct post-incident reviews and implement corrective actions to prevent future incidents.
- Drive capacity planning, performance analysis, and optimization efforts.

Qualifications:

- Experience managing Technical Support Engineers and overseeing 24/7 operations.
- Strong expertise in Azure cloud. Scripting experience is nice to have.
- Experience with monitoring and observability tools.
- Working knowledge of ITSM, escalation processes, and administration.
- Excellent communication and leadership skills.