Job Title: Data Engineer – Medical Imaging & Health Data
Experience Required: Minimum 3 Years
Location: Chennai
Employment Type: Full-time

About the Role

We are seeking a skilled and experienced Data Engineer with a strong background in medical imaging and healthcare data to join our growing team. The role involves building scalable and secure data infrastructure to power clinical research, AI-based diagnostics, and healthcare analytics. It is ideal for someone who is comfortable working with diverse data types, including medical images, clinical records, and time-series data, and who has experience with modern data lakes, streaming platforms, and distributed databases.

Key Responsibilities

● Ingest, process, and manage large-scale medical imaging datasets (e.g., MRI, CT scans, pathology slides) in DICOM and non-DICOM formats.
● Develop robust ETL/ELT pipelines to extract and transform healthcare data from EHRs, PACS, RIS, and medical registries using tools like Apache NiFi, Airflow, or Dagster.
● Build streaming and event-driven pipelines using Kafka, RabbitMQ, and other messaging systems.
● Design scalable storage systems for structured, unstructured, and semi-structured data using data lakes (e.g., Apache Hudi, Iceberg, Delta Lake) over Amazon S3 or MinIO.
● Implement distributed databases (e.g., Cassandra, ClickHouse, MongoDB, Elasticsearch) for various analytical workloads.
● Collaborate with clinicians, researchers, and ML teams to prepare datasets for downstream AI/ML pipelines and analytics platforms.
● Integrate graph databases for modeling complex biomedical relationships.
● Ensure data security, governance, anonymization, and compliance with HIPAA, GDPR, and related healthcare standards.
● Enable data observability, monitoring, and audit trails for all pipeline components.
● Work with query engines such as Trino for federated query access across systems.
● Support data versioning and reproducibility using DVC.
● Perform data migrations and query optimization across polyglot data systems.

Must-Have Skills & Experience

● Bachelor’s or Master’s in Computer Science, Biomedical Engineering, Health Informatics, or a related field.
● Minimum 3 years of experience as a Data Engineer in the healthcare or biomedical domain.
● Expertise in Python, SQL, and handling medical imaging with libraries like pydicom, SimpleITK, or NiBabel.
● Solid understanding of healthcare interoperability standards (DICOM, HL7, FHIR, OMOP).
● Hands-on experience with distributed systems and databases like Cassandra, MongoDB, Elasticsearch, ClickHouse, TimescaleDB, and Redis.
● Experience with Apache Spark, Apache Kafka, and streaming/event-driven architectures.
● Familiarity with data lakes (Apache Hudi, Iceberg, Delta Lake) and cloud/object storage (Amazon S3, MinIO).
● Proficiency in ETL orchestration using Airflow, Dagster, or Apache NiFi.
● Comfort with messaging queues (RabbitMQ).
● Strong foundation in data security, privacy, and regulatory compliance (HIPAA, GDPR).

Good-to-Have Skills

● Experience working with time-series databases: InfluxDB, TimescaleDB, QuestDB.
● Experience with graph databases like OrientDB, RavenDB, or Neo4j.
● Exposure to SQL/NoSQL ecosystems: MySQL, PostgreSQL, MariaDB, HBase, Bytebase.
● Familiarity with Elasticsearch for indexing and search over medical datasets.
● Prior involvement in MLOps, feature stores, or AI/ML lifecycle integration.
● Understanding of data observability and monitoring tools for pipelines.
● Experience with data migration strategies and query performance tuning.
● Exposure to clinical registries, cohort builders, or clinical trial platforms.

Kindly share your resume.