Job Title


Data Engineer


Company : GMG


Location : Meerut, Uttar Pradesh


Created : 2025-05-02


Job Type : Full Time


Job Description

What we do:
GMG is a global well-being company retailing, distributing, and manufacturing a portfolio of leading international and home-grown brands across the sport, food, and health sectors. Its vision is to inspire people to win in ways that make the world better. Today, GMG's investments span four key verticals: GMG Sports, GMG Food, GMG Health, and GMG Consumer Goods. Under the ownership and management of the Baker family, it has become a leading global company, affiliated with the world's most successful and respected brands in the well-being sector. Working across the Middle East, North Africa, and Asia, GMG has introduced more than 120 brands into its markets.

What will you do:
We are seeking a highly skilled Data Engineer specializing in AWS and Databricks. The ideal candidate will design, build, and maintain scalable data pipelines, ensuring efficient data ingestion, processing, and integration from multiple sources. This role requires expertise in AWS services, PySpark, SQL, and Databricks, along with strong optimization, security, and cost-management skills.

Roles and Responsibilities:

Data Engineering & Pipeline Development
• Develop and manage ETL pipelines for structured, semi-structured, and unstructured data using AWS Glue, PySpark, and SQL.
• Handle real-time event-stream data ingestion and processing from multiple source systems.
• Ensure efficient data integration into Databricks for advanced processing and analytics.

Cloud & Infrastructure Management
• Build and optimize backend systems leveraging AWS services (Glue, Athena, Lambda, SNS, S3).
• Implement, configure, and manage Databricks environments, including clusters, notebooks, and libraries, for performance optimization.
• Ensure optimal resource utilization of AWS and Databricks clusters to improve efficiency and reduce costs.
• Integrate Databricks with various cloud services while following governance and security best practices.

Testing & CI/CD Best Practices
• Write unit tests and integration tests to ensure data pipeline reliability.
• Establish best practices for Databricks CI/CD and implement deployment automation.

Optimization & Security
• Apply performance-tuning techniques to optimize queries, storage, and processing times.
• Ensure compliance with security, governance, and industry best practices across AWS and Databricks environments.
• Monitor system performance and proactively address issues to maintain high availability and reliability.

Functional/Technical Competencies:
• AWS services: Glue, PySpark, SQL, Athena, Lambda, SNS, S3
• Databricks: cluster setup, notebooks, libraries, CI/CD, optimization
• Data processing: event-stream ingestion and batch processing
• Testing: writing unit tests and integration tests
• Security & governance: AWS/Databricks governance standards and best practices
• Performance optimization: query tuning, cluster performance improvements, cost reduction
• Strong problem-solving and analytical skills
• Ability to work in a fast-paced, cloud-based data environment
• Excellent collaboration and communication skills
• Strong attention to detail and commitment to best practices

Educational Qualification:
• Bachelor's degree in Computer Science or Computer Engineering
• Certification in Data Engineering and Analytics

Experience:
Minimum 6 years' experience in data engineering (core development/design), including 3+ years in AWS with command of AWS Glue, PySpark, SQL, Athena, Lambda, SNS, and S3, and 1+ year in Databricks.