Job Title: Data Engineer – Medical Imaging & Health Data
Experience Required: Minimum 3 Years
Location: Chennai
Employment Type: Full-time

About the Role

We are seeking a skilled and experienced Data Engineer with a strong background in medical imaging and healthcare data to join our growing team. The role involves building scalable and secure data infrastructure for powering clinical research, AI-based diagnostics, and healthcare analytics.

This role is ideal for someone comfortable working with diverse data types, including medical images, clinical records, and time-series data, and who has experience with modern data lakes, streaming platforms, and distributed databases.

Key Responsibilities

● Ingest, process, and manage large-scale medical imaging datasets (e.g., MRI, CT scans, pathology slides) in DICOM and non-DICOM formats.
● Develop robust ETL/ELT pipelines to extract and transform healthcare data from EHRs, PACS, RIS, and medical registries using tools like Apache NiFi, Airflow, or Dagster.
● Build streaming and event-driven pipelines using Kafka, RabbitMQ, and other messaging systems.
● Design scalable storage systems for structured, unstructured, and semi-structured data using data lakes (e.g., Apache Hudi, Iceberg, Delta Lake) over Amazon S3 or MinIO.
● Implement distributed databases (e.g., Cassandra, ClickHouse, MongoDB, Elasticsearch) for various analytical workloads.
● Collaborate with clinicians, researchers, and ML teams to prepare datasets for downstream AI/ML pipelines and analytics platforms.
● Integrate graph databases for modeling complex biomedical relationships.
● Ensure data security, governance, anonymization, and compliance with HIPAA, GDPR, and related healthcare standards.
● Enable data observability, monitoring, and audit trails for all pipeline components.
● Work with query engines such as Trino for federated query access across systems.
● Support data versioning and reproducibility using DVC.
● Perform data migrations and query optimization across polyglot data systems.

Must-Have Skills & Experience

● Bachelor’s or Master’s in Computer Science, Biomedical Engineering, Health Informatics, or a related field.
● Minimum 3 years of experience as a Data Engineer in the healthcare or biomedical domain.
● Expertise in Python, SQL, and handling medical imaging with libraries like pydicom, SimpleITK, or NiBabel.
● Solid understanding of healthcare interoperability standards (DICOM, HL7, FHIR, OMOP).
● Hands-on with distributed systems and databases like Cassandra, MongoDB, Elasticsearch, ClickHouse, TimescaleDB, and Redis.
● Experience with Apache Spark, Apache Kafka, and streaming/event-driven architectures.
● Familiar with data lakes (Apache Hudi, Iceberg, Delta Lake) and cloud/object storage (Amazon S3, MinIO).
● Proficient with ETL orchestration using Airflow, Dagster, or Apache NiFi.
● Comfortable with messaging queues (RabbitMQ).
● Strong foundation in data security, privacy, and regulatory compliance (HIPAA, GDPR).

Good-to-Have Skills

● Experience working with time-series databases: InfluxDB, TimescaleDB, QuestDB.
● Experience with graph databases like OrientDB, RavenDB, or Neo4j.
● Exposure to SQL/NoSQL ecosystems: MySQL, PostgreSQL, MariaDB, HBase, Bytebase.
● Familiarity with Elasticsearch for indexing and search over medical datasets.
● Prior involvement in MLOps, feature stores, or AI/ML lifecycle integration.
● Understanding of data observability and monitoring tools for pipelines.
● Experience with data migration strategies and query performance tuning.
● Exposure to clinical registries, cohort builders, or clinical trial platforms.