Key Responsibilities
Lead and mentor engineering teams to adopt SRE principles and modern DevOps practices
Design, build, and maintain reliable infrastructure using public cloud platforms (AWS, GCP, or Azure)
Ensure operational excellence by implementing observability tools like Prometheus, Grafana, AWS CloudWatch, Splunk, and AppDynamics
Troubleshoot and optimize performance of large-scale distributed systems and containers
Collaborate with cross-functional teams to ensure best practices in CI/CD pipelines using tools like Jenkins, TeamCity, and Octopus Deploy
Contribute to technical strategy, roadmap definition, and best practice frameworks for the SRE function

Essential Skills & Experience
14+ years of experience in technology roles, with strong software engineering expertise
Proficient in at least one modern programming language (Golang or Python preferred)
Extensive knowledge of Linux internals, networking, and containerization technologies
Strong experience working with public cloud platforms (AWS, GCP, or Azure)
Proven track record of applying SRE practices in large-scale enterprise environments
Hands-on experience with CI/CD tools like TeamCity, Jenkins, Octopus Deploy, or similar
Excellent leadership, communication, and problem-solving skills
Job Title
Chief Site Reliability Engineer