About Ticketek Entertainment Group
Ticketek Entertainment Group is a global fan-experience company that tickets, promotes and delivers incredible live experiences that are impossible to forget. In a distracted world where nothing beats real human moments, we make life better live! Our Group includes a Fan Experience Platform (Ticketek) that sells tickets and provides value-added services, event promotion across touring, sport and family experiences, and a digital business (Ovation) focused on delivering seamless data-driven outcomes for fans and partners.

About the Role
We are hiring a Site Reliability Engineer to join TEG's Technology department and champion the reliability and scalability of our global live entertainment platforms. In this role, you will apply software-engineering principles to operations, proactively enhancing system reliability and preventing outages to deliver seamless experiences for our customers across our global ticketing platform.

Locations: Adelaide, Sydney, Melbourne or Brisbane.
Responsibilities
Proactively guard the health, availability, and performance of TEG's critical global production systems
Engineer and automate robust monitoring and auto-healing solutions to prevent outages and meet service level objectives (SLOs)
Drive Infrastructure-as-Code (IaC) principles for provisioning and deploying our highly available, scalable platforms
Lead critical incident response efforts, ensuring rapid resolution and restoration of platform stability
Provide technical leadership during major incidents, focusing on swift problem analysis and effective communication to stakeholders
Transform incidents into progress by conducting deep postmortems and driving the implementation of strategic preventative measures across various teams
Build and maintain high-performing, fault-tolerant distributed systems emphasizing resiliency and efficiency
Elevate operational maturity by continuously improving processes, tooling, and efficiency across the department
Champion operational excellence and shared responsibility, collaborating with development and other teams to improve processes and tools
Innovate system design by evaluating and integrating new technologies to enhance reliability, scalability, and security
Mentor and coach colleagues, elevating the overall reliability engineering capability and maturity of the Technology department

Essential Experience & Skills
Mastery of highly available, fault-tolerant AWS system design and management
Strong foundation in AWS networking (VPC, Route53) and security best practices
Proficiency in key scripting languages (Python, Bash, PowerShell) for automation
Proven ability to perform effectively under pressure, managing high-volume tasks and meeting tight deadlines
Minimum of 3 years of prior SRE or DevOps experience
Expert knowledge of fundamental infrastructure concepts (Networking, Containerisation, Virtualisation, DNS)
Working familiarity with key CI/CD and Infrastructure-as-Code tools (e.g., Terraform, Ansible, Jenkins)
Excellent verbal and written communication skills

Desirable Experience & Skills
Hands-on experience with the ELK Stack or advanced monitoring tools (Prometheus/Grafana)
Relevant AWS certifications (e.g., AWS Certified SysOps Administrator or DevOps Engineer Professional)
Demonstrated ability to optimise AWS costs while maintaining performance and reliability

Benefits
Complimentary event tickets
Birthday and volunteering leave
Wellbeing discounts & flu vaccinations
Paid parental leave & free employee support (EAP)
Global rewards and recognition
Learning, development & career pathways
A diverse, inclusive, and passionate team

Equal Opportunities
TEG is an equal-opportunity employer committed to embracing diversity and to respecting and caring for our people and communities. If there are any adjustments needed to ensure you have a fair and equitable experience in our recruitment process, please advise us when scheduling your interview.

Only direct applications will be considered. No recruiters please.

Job Title
Site Reliability Engineer