Job Title


Lead Data Engineer


Company : Health Catalyst


Location : New Delhi, Delhi


Created : 2026-03-20


Job Type : Full Time


Job Description

The healthcare industry is the next great frontier of opportunity for software development, and Health Catalyst is one of the most dynamic and influential companies in this space. We are working on solving national-level healthcare problems, and this is your chance to improve the lives of millions of people, including your family and friends. Health Catalyst is a fast-growing company that values smart, hardworking, and humble individuals.

Each product team is a small, mission-critical team focused on developing innovative tools to support Catalyst’s mission to improve healthcare performance, cost, and quality. Health Catalyst is expanding and maintains a large suite of Improvement Apps that contribute to healthcare analytics and process improvement solutions. This includes products that manage the care of health system populations, better serve patients at the point of care, reduce health system costs, and reduce clinician workload.

Job Summary:

As a Lead Data Engineer, you will work with the software engineering team on diverse Improvement Apps, designing, developing, and maintaining various platforms that serve internal HCAT team members, clinicians, and patients. You will rely on Test-Driven Development to safely enhance and refactor our system, shipping production code multiple times per week. And you will go to bed each night with the comfort that your code is improving outcomes for patients.

If you love…

- Help drive clarity and prototype individual features or problems
- Knowledge of architecture patterns and the ability to design and complete features/tasks that are 50-60% well defined
- Can discern where gaps can be filled in without consulting a Product Manager or another programmer, and can judge when a consultation is needed
- Work is reviewed with the occasional need for material direction or implementation changes
- Seeks and provides guidance via PR reviews, pair programming, and other interactions with Engineers and Product Managers
- It is second nature to develop high code quality standards balanced with the needs of real-world customer timelines
- Possesses a passion and drive to deliver exceptional products, and follows established patterns and approaches within existing code bases with ease
- Takes ownership of learning and growth
- Capitalizes on internal and external opportunities for learning
- Identifies gaps in knowledge/skills and seeks ways to close those gaps (self-guided learning, pairing, seeking guidance for yourself, and developing guidance for less experienced members of the team)
- Periodic On Call Rotation
- Ability to communicate with Customer Success about customer issues that are escalated to Engineering and help quantify customer impact
- Can respond quickly to operational emergencies, find short-term resolutions, and plan long-term fixes to avoid similar issues in the future

What you own in the role:

- Design, develop, and optimize complex SQL queries, stored procedures, and data models to support large-scale analytics and reporting pipelines for patient engagement and clinical outcomes data.
- Architect and implement scalable data ingestion, transformation, and processing workflows using PySpark on Databricks, ensuring high performance and reliability across batch and streaming pipelines.
- Lead the design and implementation of enterprise-grade data platforms, including Delta Lake architecture on Databricks, enforcing data quality standards, partitioning strategies, and schema evolution best practices aligned with Health Catalyst's ML/AI services.
- Build and maintain robust ETL/ELT pipelines to acquire data from primary and secondary sources (including relational databases, HL7/FHIR feeds, and flat files), integrating them into unified data products and analytics-ready datasets.
- Develop and enforce data quality frameworks using Databricks Delta Live Tables and custom PySpark validation logic to proactively detect, flag, and resolve data integrity issues across pipelines.
- Collaborate with data science, ML engineering, and product teams to translate business and analytical requirements into scalable data infrastructure, and drive prioritization of data platform improvements aligned with organizational goals.
- Continuously evaluate and identify opportunities to refactor legacy SQL-based workflows into optimized PySpark pipelines, improving pipeline throughput, cost efficiency, and maintainability on the Databricks platform.

What you bring to this role:

- Bachelor's degree or equivalent practical experience preferred
- Strong working knowledge of SQL
- Technical expertise regarding data models, database design and development, data mining, and segmentation techniques
- Strong knowledge of and experience with reporting software such as Power BI, BusinessObjects, Looker, Tableau, etc. (Looker experience preferred)
- Strong analytical skills with the ability to collect, organize, analyze, and disseminate significant amounts of information with attention to detail and accuracy, in a timely manner
- Adept at constructing efficient queries, writing reports, and presenting findings
- Ability to manage multiple simultaneous responsibilities and to prioritize scheduling of work
- Strong verbal and written communication skills
- An understanding of healthcare data is a plus, but not a requirement

You may also bring:

- Experience with cloud infrastructure and architecture patterns, either Azure or AWS preferred
- Software development experience within healthcare IT and an understanding of key data models (clinical, claims, financial, etc.) and interoperability standards such as HL7v2, CDA, EMR, and FHIR
- Knowledge of healthcare compliance and how it applies to Application Security
- Agile/Scrum software development practices
- Business Intelligence or Data Warehousing experience

Preferred Experience and Education:

- BS/BA or MS in Computer Science, Information Systems, or other technology/science degree
- A minimum of 7 years of experience in building commercial software, SaaS, or digital platforms

Please note: We currently have multiple available roles and are open to various levels of experience to fill those roles. We will consider junior, mid, and senior level experience on a case-by-case basis. If you feel this role is a match for your skills and experience, we encourage you to apply.

Equal Employment Opportunity has been, and will continue to be, a fundamental principle at Health Catalyst, where employment is based upon personal capabilities and qualifications without discrimination or harassment on the basis of race, color, national origin, religion, sex, sexual orientation, gender identity, age, disability, citizenship status, marital status, creed, genetic predisposition or carrier status, or any other characteristic protected by law. Health Catalyst is committed to a work environment where all individuals are treated with respect and dignity.