Job Title


Data Center Operations Engineer


Company : Covestic Inc


Location : Toronto, Ontario


Created : 2026-01-22


Job Type : Full Time


Job Description

Seeking an experienced Data Center Operations Engineer to ensure environments run with precision, efficiency, and uptime across global sites. This role will bridge IT and facilities, maintaining the power, cooling, and compute systems that sustain the company's world-class AI platforms.

This role requires a technically strong, detail-oriented engineer who thrives in high-availability environments. You must understand the full stack of data center infrastructure (compute, network, power, and cooling) and take pride in systems that run flawlessly because of your work. You should be able to communicate clearly, perform methodically under pressure, and collaborate effectively across IT, facilities, and vendor teams. This role requires a builder, a problem-solver, and a guardian of uptime: someone who values precision, safety, and accountability in every aspect of operations.

Responsibilities

Own the day-to-day reliability and performance of company data centers, supporting both IT and facility infrastructure. This includes installing and configuring servers and compute equipment, managing structured cabling, and performing Layer 1-3 troubleshooting across compute and network layers.

Partner closely with colocation and data center providers to maintain uptime: reviewing maintenance procedures, coordinating planned work, validating redundancy during transitions, and verifying site health after power or cooling events.

Working alongside facilities teams, you'll help operate and maintain critical power and cooling systems, including transformers, PDUs, UPS, switchgear, generators, CRAC and CRAH units, CDUs, chillers, cooling towers, and containment systems. You'll assist in capacity planning, preventive maintenance, and load balancing across power and cooling zones to maintain safe, efficient, and redundant operations.

Lead incident response and root-cause analysis, refine standard operating procedures, and implement automation to improve efficiency and consistency across company data centers worldwide.

Required Skills

- 10+ years in data center compute operations, facilities, or infrastructure engineering and/or a degree in an Engineering or Computer Science discipline
- Hands-on experience with servers, networking, and structured cabling
- Working knowledge of electrical systems including transformers, PDUs, UPS, switchgear, and generators
- Understanding of cooling systems including CRAC/CRAH units, CDUs, cooling towers, chillers, and containment environments
- Familiarity with Linux and basic scripting (Bash, Python, Ansible)
- Proficiency with network CLIs (Cisco, Arista, Juniper)
- Experience collaborating with colocation providers and reviewing MOP/EOPs for electrical and mechanical work
- Proficiency with ITSM/DCIM platforms (e.g., Jira, ServiceNow, NetBox, Sunbird)
- Ability to manage server, switch, router, storage, and hardware lifecycle processes
- Ability to update asset management systems using scanners and inventory tools
- Strong documentation, troubleshooting, and communication skills for ticketing, customer communication, and team coordination
- Strong multitasking, adaptability, and time-management skills with a focus on quality and throughput
- Must be punctual, reliable, and well-organized
- Must have strong interpersonal and teamwork skills, with the ability to work independently when needed
- Willingness to support on-call rotation and meet a 60-minute onsite SLA
- Ability to safely lift 50-75 lbs and remain on feet for the majority of the workday
- Ability to operate material-handling equipment (pallet jacks, forklifts, server lift)
- Demonstrated ability to learn new systems, methodologies, software, and hardware platforms
- Experience working in high-tempo, high-stress environments
- Experience leading and/or mentoring more junior staffers
- Domain expert in one or more of the following functional areas:
  - Datacenter power systems
  - Datacenter cooling/HVAC systems
  - Server or liquid cooling
  - Network routing and switching
  - Late generation flash storage arrays
  - Facility and/or network security
  - Network infrastructure monitoring
- Ability to project manage key datacenter-centric initiatives
- Able to effectively present data to senior leadership
- Familiar with datacenter key performance indicators (KPIs)
- Ability to manage outage events
- Be a good person and a good teammate

Preferred Certifications

- CompTIA Server+, Network+, or Linux+
- ITIL Foundation certification
- Networking: CCNA, JNCIA, ACEA