Job Summary:
We are looking for a highly experienced and sharp Data Lake Implementation Specialist to lead and execute scalable data lake projects using technologies such as Apache Hudi, Hive, Python, Spark, Flink, and cloud-native tools on AWS or Azure. The ideal candidate must have deep expertise in designing and optimizing modern data lake architectures, with strong programming skills and data engineering capabilities.

Key Responsibilities:
- Design, develop, and implement robust data lake architectures on cloud platforms (AWS/Azure).
- Implement streaming and batch data pipelines using Apache Hudi, Apache Hive, and cloud-native services like AWS Glue, Azure Data Lake, etc.
- Architect and optimize ingestion, compaction, partitioning, and indexing strategies in Apache Hudi.
- Develop scalable data transformation and ETL frameworks using Python, Spark, and Flink.
- Work closely with DataOps/DevOps to build CI/CD pipelines and monitoring tools for data lake platforms.
- Ensure data governance, schema evolution handling, lineage tracking, and compliance.
- Collaborate with analytics and BI teams to deliver clean, reliable, and timely datasets.
- Troubleshoot performance bottlenecks in big data processing workloads and pipelines.

Must-Have Skills:
- 4+ years of hands-on experience in Data Lake and Data Warehousing solutions
- 3+ years of experience with Apache Hudi, including insert/upsert/delete workflows, clustering, and compaction strategies
- Strong hands-on experience in AWS Glue, AWS Lake Formation, or Azure Data Lake / Synapse
- 6+ years of coding experience in Python, especially in data processing
- 2+ years of working experience in Apache Flink and/or Apache Spark
- Sound knowledge of Hive, Parquet/ORC formats, and Delta Lake vs. Hudi vs. Iceberg
- Strong understanding of schema evolution, data versioning, and ACID guarantees in data lakes

Nice to Have:
- Experience with Apache Iceberg, Delta Lake
- Familiarity with Kinesis, Kafka, or any streaming platform
- Exposure to dbt, Airflow, or Dagster
- Experience in data cataloging, data governance tools, and column-level lineage tracking

Education & Certifications:
- Bachelor’s or Master’s degree in Computer Science, Information Technology, or related field
- Relevant certifications in AWS Big Data, Azure Data Engineering, or Databricks

Job Title
Data Engineer / Data Lake Implementation Specialist