Overview
Senior Site Reliability Engineer - Remote
Location: Canada (remote)

About ClickHouse
Recognized on the 2025 Forbes Cloud 100 list, ClickHouse is one of the most innovative and fast-growing private cloud companies. With more than 3,000 customers and ARR that has grown over 250 percent year over year, ClickHouse leads the market in real-time analytics, data warehousing, observability, and AI workloads. The company's sustained, accelerating momentum was recently validated by a $400M Series D financing round. Customers include Capital One, Lovable, Decagon, Polymarket, and Airwallex, in addition to brands such as Meta, Cursor, Sony, and Tesla. We're on a mission to transform how companies use data.

About the role
We are expanding our central Site Reliability Engineering team to provide reliable and secure services. You will be responsible for building and leading processes to ensure the reliability, availability, scalability, and performance of our cloud infrastructure. You will collaborate with Control Plane, Data Plane, Core, Security, Support and Operations teams to design and implement scalable, secure, highly available and fault-tolerant distributed systems. You will own incident management and response, blameless post-mortem analysis, and continuous improvement of our Cloud services. You will leverage software engineering to develop platforms and tools to optimize operational and engineering efficiencies of ClickHouse Cloud. This role offers the opportunity to impact our elastic, high-performance ClickHouse Cloud at scale.

What will you do?
Collaborate with engineering teams to design and implement scalable, secure, and highly available systems for ClickHouse.
Establish and manage service level objectives (SLOs) and service level agreements (SLAs) for ClickHouse Cloud.
Ensure infrastructure components in ClickHouse Cloud have monitoring and alerting to detect and resolve incidents.
Improve incident response processes and post-mortem analysis, including communicating with impacted customers through the support team.
Continuously improve reliability and performance of ClickHouse services.
Plan, enable, and drive Chaos initiatives across Engineering teams based on internal priorities.
Manage on-call processes and establish best practices for escalation to minimize downtime.

About you
Bachelor's or Master's degree in Computer Science or related field.
At least 8 years of experience in Site Reliability Engineering or related field.
Hands-on experience with Go and/or Python.
Strong knowledge of cloud platforms (AWS, Azure, GCP).
Excellent understanding of distributed databases and SQL; ClickHouse expertise is a plus.
Hands-on experience with container orchestration tools (Kubernetes, Docker Swarm).
Experience with automation and configuration management tools (Ansible, Terraform, Puppet).
Strong problem-solving and production debugging skills.
Focus on efficiency, availability, scalability, and data governance.
Ability to thrive in a fast-paced environment and partner with the business to move the company forward.
High level of responsibility, ownership, and accountability.
Excellent communication and interpersonal skills.

Compensation
For roles based in the United States, the typical starting salary range is listed above; in certain locations such as San Francisco Bay Area and New York City Metro Area, premium ranges may apply. These ranges reflect the minimum and maximum pay at posting and may be adjusted in the future. An individual's placement within the range depends on factors including education, qualifications, certifications, experience, skills, location, performance, and business needs.

Flexible work environment: Remote-friendly, 20 countries in operation.
Healthcare: Employer contributions toward healthcare.
Equity: Stock options for new team members.
Time off: Flexible time off in the US, generous entitlement elsewhere.
Home office setup: $500 for remote employees.
Global Gatherings: Opportunities to engage with colleagues at company-wide offsites.

Culture
We shape our culture together as part of ClickHouse's first 500 employees. Learn more about our values and culture on our blog and LinkedIn.

Equal Opportunity & Privacy
ClickHouse provides equal employment opportunities to all employees and applicants and prohibits discrimination and harassment of any type based on protected characteristics. Please see our Privacy Statement for details.

Voluntary Self-Identification
For government reporting, we invite candidates to respond to the voluntary self-identification survey. Completion is voluntary, confidential, and will not affect hiring decisions. If you have questions, visit the OFCCP website.
Job Title
Senior Site Reliability Engineer - Remote