Job Description

We are looking for an experienced Python Data Engineer to design and build scalable data solutions and accelerators that enable seamless migration from legacy platforms to modern cloud-based architectures. This role is central to the development and optimization of high-performance data pipelines using Python and PySpark on platforms like Azure Databricks. You will work closely with internal stakeholders and external partners to deliver robust, reusable, and observable data products.

Key Responsibilities:

Design & Development: Build reusable accelerators and automation solutions to facilitate migration from legacy data platforms (e.g., Teradata) to Azure Databricks.
Pipeline Engineering: Develop and optimize large-scale ETL/ELT data pipelines using Python, PySpark, and SQL, ensuring performance, cost efficiency, and scalability.
Code Reusability & Testing: Champion standards-based development with reusable code modules, test-driven development (TDD), unit testing, and automation.
Monitoring & Observability: Implement frameworks to monitor pipeline quality, performance, and operational KPIs, ensuring full observability and data reliability.
Cloud Engineering: Architect and deploy cloud-native data solutions, leveraging Azure services such as ADF, ADLS, Azure SQL, and Databricks.
Collaboration: Work with internal teams (product, data science, domain leads) and external vendors to define and refine solution requirements and SLAs.
Architecture & Integration: Collaborate with enterprise architects to evolve the data platform architecture. Integrate with both on-premises and multi-cloud systems securely and efficiently.
Experimentation Support: Partner with data scientists to support and scale advanced analytical models and ML experiments.
Documentation & Knowledge Sharing: Create clear and comprehensive technical documentation for reuse and onboarding.

Qualifications:

Experience: 7+ years in tech, with 4+ years in data engineering, software development, and systems architecture.
Programming: Strong in Python and PySpark (Scala optional), with hands-on experience with big data platforms like Databricks.
Cloud & DevOps: 3+ years in Azure or AWS cloud environments. Familiar with CI/CD pipelines, GitHub/ADO, and containerization (e.g., Kubernetes).
Databases: Proficiency in SQL tuning and optimization. Experience with MPP databases (Databricks, Redshift, Synapse, or Snowflake).
Tools & Technologies: Azure Data Factory, Azure Databricks, and Azure ML; data profiling and quality tools; business intelligence tools such as Power BI.
Bonus: Azure Data Engineer certification; exposure to ML/AI techniques, metadata management, and data governance.

Job Title

Company : Jupiter AI Labs

Location : Dehradun, Uttarakhand

Created : 2025-05-02

Job Type : Full Time