Actual annual compensation offered will be based on several variables, including geographic location, work experience, education, and skills and achievements, and will be mutually agreed upon at the time of offer. The average compensation for this role is $85,000 to $105,000 CAD.

About the Role

As an Application Observability Engineer, you'll operate at the intersection of applications, infrastructure, and reliability. You will support and troubleshoot the internal platforms and microservices that power critical business systems, ensuring services are healthy, observable, and performant in a large-scale enterprise environment.

You'll partner closely with application developers, platform engineers, operations teams, and system administrators to investigate production issues, validate deployments, and maintain stable environments.

This role is ideal for someone early to mid-career (1 to 4 years of experience) who enjoys hands-on troubleshooting, learning how distributed systems work in practice, and supporting modern, containerized platforms using tools like Kibana, Grafana, Jaeger, VictoriaMetrics, Redis, and Kubernetes.

What You'll Do

Support and troubleshoot enterprise platforms
Monitor and support internal platforms and microservices running across server-based and containerized environments.
Investigate production issues by analyzing logs, metrics, and system health signals to identify root causes.
Troubleshoot application-level failures, performance issues, and connectivity problems across distributed systems.

Work with containerized and microservices environments
View, manage, and troubleshoot containerized workloads, including application deployments and configuration changes.
Understand service lifecycles, health checks, and when corrective actions (such as restarts or escalations) are required.
Leverage AI-assisted tools to accelerate troubleshooting, analysis, and documentation while maintaining sound engineering judgment.

Maintain configuration and application health
Review and maintain application configuration using centralized configuration management approaches.
Validate application health endpoints and diagnostic signals to ensure services are operating as expected.
Support application deployments to servers and platform environments following established processes.

Troubleshoot supporting platform services
Investigate issues related to platform dependencies such as caching or in-memory data stores (e.g., Redis).
Identify common failure modes such as configuration errors, resource exhaustion, or network-related issues that impact application behavior.

Test and validate APIs and services
Test, validate, and troubleshoot APIs using industry-standard tools to confirm expected behavior.
Work with development teams to reproduce issues and verify fixes before and after deployment.

Collaborate across engineering and operations teams
Partner with developers, platform engineers, and operations teams to resolve incidents and improve platform stability.
Document troubleshooting steps, findings, and operational runbooks to improve team knowledge and response time.

What We're Looking For

Required
1 to 4 years of relevant experience in systems engineering, platform engineering, application support, DevOps support, or a related technical role.
Bachelor's degree in Computer Science, Information Technology, or a related field, or equivalent practical experience.
Experience troubleshooting applications in enterprise or distributed system environments.
Familiarity with containerized platforms and microservices architectures.
Experience using logging, monitoring, and observability tools to diagnose issues.
Coding or scripting knowledge (e.g., Java, Python, Bash) to assist with troubleshooting and automation.
Experience testing and validating APIs.
Working knowledge of networking concepts as they relate to applications and containerized workloads.
Strong analytical, problem-solving, and communication skills.

Preferred
Exposure to cloud-native environments and CI/CD pipelines.
Experience troubleshooting caching systems or in-memory data stores (e.g., Redis) and using logging tools like Kibana (Elasticsearch).
Familiarity with application health checks, diagnostics, and service monitoring patterns.
Experience working in large-scale enterprise environments with multiple teams and shared platforms.
Site Reliability Engineering (SRE) experience.

Working Conditions & Flexibility
Work model: Hybrid.
Schedule: Occasional non-standard hours or overtime may be required based on business needs; some on-call availability may also be necessary for critical production support.
Travel: May include local (in-country) and global travel for key meetings or team events.

Key Skills
Application Monitoring, Elasticsearch, Grafana, IT Production Support, Kubernetes, Site Reliability Engineering

At TD SYNNEX, our values guide everything we do: Together, We Own It, We Dare to Go, We Grow and Win, and above all, We Do the Right Thing. These principles shape how we work with each other, our partners, and our communities as we drive innovation and create lasting impact.

What's In It For You?
Elective Benefits: Our programs are tailored to your country to best accommodate your lifestyle.
Grow Your Career: Accelerate your path to success (and keep up with the future) with formal programs on leadership and professional development, and many more on-demand courses.
Elevate Your Personal Well-Being: Boost your financial, physical, and mental well-being through seminars, events, and our global Life Empowerment Assistance Program.
Diversity, Equity & Inclusion: It's not just a phrase to us; valuing every voice is how we succeed. Join us in celebrating our global diversity through inclusive education, meaningful peer-to-peer conversations, and equitable growth and development opportunities.
Make the Most of our Global Organization: Network with other new coworkers within your first 30 days through our onboarding program.
Connect with Your Community: Participate in internal, peer-led inclusive communities and activities, including business resource groups, local volunteering events, and more environmental and social initiatives.

Don't meet every single requirement? Apply anyway. At TD SYNNEX, we're proud to be recognized as a great place to work and a leader in the promotion and practice of diversity, equity and inclusion. If you're excited about working for our company and believe you're a good fit for this role, we encourage you to apply. You may be exactly the person we're looking for!
Job Title
Application Observability Engineer