Technical Skills Required:
- At least intermediate level in AWS ETL (Glue, Lambda, Batch, EMR) and API development (FastAPI + core Python), plus Redshift
- AWS Redshift:
  - Creating provisioned clusters
  - Expert knowledge of workload management (WLM)
  - Materialized views, DDL optimizations, JSON handling
  - AWS Glue to Redshift loads
  - Row- and column-level security
- AWS Glue (ETL, jobs, crawlers, Data Catalog including Iceberg, workflows, DynamicFrames)
- Batch and streaming workloads
- Performance tuning and cost optimization
- IAM, KMS, Secrets Manager, and fine-grained access control (encryption experience good to have)
- PySpark, Python
- Lambda, Amazon S3, Athena, RDS
- Apache Parquet, JSON, CSV
- Data lake design and implementation

Good to have:
- CI/CD (Terraform, AWS CDK, or CloudFormation)
- Data lineage and governance
- Kinesis, Kafka, Glue Streaming, AWS Batch

Key Responsibilities:
- Collaborate with business stakeholders to analyze data requirements and define an ETL and API architecture aligned with business goals.
- Design and implement scalable ETL workflows using AWS Glue and PySpark for ingesting structured and semi-structured data into the AWS data lake.
- Develop reusable Glue jobs and crawlers for automated metadata cataloging and data transformations.
- Optimize Glue job performance using DynamicFrame partitioning, job bookmarks, and parallelism tuning.
- Integrate Glue with other AWS services such as S3, Athena, Redshift, Lambda, and CloudWatch for end-to-end orchestration and monitoring.
- Lead the data lakehouse implementation, leveraging Glue with Iceberg for versioned, transactional data storage.
- Ensure secure access to datasets using fine-grained IAM policies and Lake Formation (good to have).
- Mentor junior engineers, enforce coding best practices, and participate in code reviews and architectural discussions.
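As a small illustration of the JSON-handling and semi-structured ingestion work described above, here is a minimal sketch in plain Python (the function name and sample event are hypothetical, not part of any existing codebase) of flattening a nested JSON record into single-level rows, the kind of shape that maps cleanly onto Parquet columns or a Redshift table:

```python
from typing import Any

def flatten_record(record: dict, parent_key: str = "", sep: str = "_") -> dict:
    """Flatten a nested JSON record into a single-level dict.

    Nested keys are joined with `sep`, e.g. {"a": {"b": 1}} -> {"a_b": 1},
    so each flattened key can become one column in a columnar format
    such as Apache Parquet or a Redshift target table.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        new_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the joined key prefix
            flat.update(flatten_record(value, new_key, sep))
        else:
            flat[new_key] = value
    return flat

# Hypothetical semi-structured event as it might arrive in S3
event = {"id": 42, "user": {"name": "ada", "geo": {"country": "IN"}}}
print(flatten_record(event))
# -> {'id': 42, 'user_name': 'ada', 'user_geo_country': 'IN'}
```

In a real Glue job this logic would typically be handled by DynamicFrame transforms such as relationalize rather than hand-rolled code; the sketch only shows the underlying idea.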
Job Title: Senior Data Engineer - AWS