Tata Consultancy Services is an equal opportunity employer. Our commitment to diversity & inclusion drives our efforts to provide equal opportunity to all candidates who meet our required knowledge & competency needs, irrespective of socio-economic background, race, color, national origin, religion, sex, gender identity/expression, age, marital status, disability, sexual orientation, or any other characteristic. We encourage anyone interested in building a career at TCS to participate in our recruitment & selection process.

TCS is seeking skilled professionals to join our team as Site Reliability Engineers (SREs).

Technical/Functional Skills:

Observability & Monitoring
- Develop proactive alerting and dashboarding strategies to detect and resolve issues before they impact customers and store operations
- Define and manage Service Level Objectives (SLOs), Service Level Agreements (SLAs), and error budgets for critical store applications
- Lead critical incident recovery and postmortem processes to drive continuous improvement

Performance & Reliability Engineering
- Identify and eliminate bottlenecks in development and deployment workflows to improve lead time and reduce change failure rates
- Partner with development teams to embed Site Reliability Engineering (SRE) principles into the software development lifecycle
- Support and optimize applications deployed across CVS retail and pharmacy locations
- Collaborate with infrastructure and store operations teams to ensure high availability and performance of store systems

Microservices & Deployments
- Champion containerization and orchestration using OpenShift and Kubernetes in hybrid cloud environments
- Leverage CI/CD pipelines to enable automated deployments at scale
- Understanding of microservices architecture

Minimum Qualifications:
- 5+ years of experience in SRE, DevOps, or related technology roles
- 3+ years of experience delivering software in a large-scale environment with reliability and resilience concepts (multi-region, multi-cloud, containerization, etc.)
- 2+ years of experience with programming languages/frameworks
- 2+ years of experience with cloud technologies (AWS, Microsoft Azure, Google Cloud), microservices concepts, and tools such as Rancher, Docker, Kubernetes, and web APIs
- 2+ years of experience with source control and continuous integration tools such as GitHub, Bitbucket, or Jenkins
- Experience with observability and monitoring tools such as Splunk, Dynatrace, Datadog, Prometheus, Grafana, etc.
- Proficiency in scripting and automation frameworks
- Understanding of microservices architecture and cloud-native technologies
- Experience in Incident Management, Change Management, Infrastructure Support, and Problem Management concepts and processes
- Excellent interpersonal and communication skills, including the ability to engage technical and non-technical stakeholders

**Work modality: Hybrid**

Candidates must be located in, or willing to relocate to, Querétaro, CDMX, Monterrey, or Guadalajara, and will be expected to attend the office at least 3 days per week.

Boost your career and send your resume to:
Job Title
Site Reliability Engineer