Job Title


Lead Site Reliability Engineer


Company : Tyson Foods India


Location : Bangalore, Karnataka


Created : 2026-02-05


Job Type : Full Time


Job Description

Lead Site Reliability Engineer (Data Observability)

We are seeking a seasoned Lead Site Reliability Engineer with 10-12 years of technical expertise in cloud infrastructure and cloud operations, preferably with knowledge of data analytics and data platforms.

The Lead Site Reliability Engineer in the Data & Analytics organization leads the architecture, design, and implementation of data observability solutions that help organizations monitor data pipelines, improve data quality, deliver reliable data products, and optimize costs. This role combines advanced engineering leadership with hands-on development, mentorship, and close collaboration with cross-functional teams.

At Tyson Foods, the Lead Site Reliability Engineer will have the opportunity to work on cutting-edge technologies and collaborate with talented professionals in a dynamic environment. We offer a culture that values innovation, growth, and work-life balance, along with opportunities for career advancement and professional development.

Primary Job Responsibilities:
Lead the design, development, and delivery of data observability features and solutions required for robust monitoring of data, data pipelines, and data platforms.
Mentor and guide junior engineers, promoting technical growth and driving best practices for observability.
Collaborate with stakeholders (data engineering, operations, product teams) to define requirements and implement scalable observability frameworks.
Build and maintain tools for real-time monitoring, logging, alerting, and anomaly detection across distributed data systems.
Lead incident response efforts, conduct root cause analysis, and implement corrective actions to prevent recurrence.
Develop automation scripts and tools to streamline operational processes and improve efficiency.
Optimize performance, cost, and scalability of observability pipelines and solutions.
Ensure observability platforms align with compliance, governance, security, and enterprise best practices.
Foster adoption of observability solutions and create documentation, playbooks, and training materials for teams.
Drive continuous improvement initiatives to enhance reliability, scalability, and performance of our cloud infrastructure.

This role requires strong technical expertise as well as excellent problem-solving and communication skills.

Qualifications:
Minimum 10 years of experience in a Site Reliability Engineering or similar role, with a strong focus on cloud infrastructure and cloud operations.
8+ years of experience and deep expertise in cloud platforms such as GCP, AWS, or Azure.
5+ years of experience and proficiency in infrastructure as code (IaC) tools such as Terraform, CloudFormation, or Ansible.
Strong knowledge of containerization and orchestration technologies (e.g., Kubernetes, Docker).
Experience with CI/CD pipelines and DevOps practices.
Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack).
5+ years of scripting and programming experience (e.g., Python, Bash, SQL).
Experience in cross-team leadership, technical strategy, and providing architectural guidance on monitoring solutions.
Experience in mentoring technical teams and driving projects to successful completion.
Excellent communication skills, with the ability to collaborate effectively across teams and influence stakeholders.
Strong analytical and troubleshooting skills, with the ability to analyze complex issues and drive solutions.
Bachelor’s degree in Computer Science, Engineering, or a related field.
Certification in a relevant cloud platform (any of GCP, AWS, Azure).
Familiarity with Agile methodology concepts.

Good to have:
Bachelor’s or Master’s degree in Computer Science or an engineering field.
Strong understanding of data analytics and data platforms in the cloud (GCP, AWS, Azure).
Experience with data observability tools (e.g., Monte Carlo, Bigeye, OpenMetadata, Acceldata).
Familiarity with data ingestion (e.g., Fivetran, HANA/SLT), data processing (e.g., dbt), and visualization (e.g., Power BI).