Job Title: Senior Data Engineer

Company: Scientist Technologies

Location: Bengaluru, Karnataka

Created: 2025-07-25

Job Type: Full Time

Job Description

Location: Bangalore
Experience: 8-10 Yrs

Important Note: We are looking for immediate joiners who are based in Bangalore.
Additional Benefit: A special Immediate Joiner Bonus will be offered to candidates who can join without delay.

About Scientist Technologies:
At Scientist Technologies, we believe in driving global progress through scientific innovation, engineering expertise, and policy collaboration. Our mission is to create solutions that empower businesses, societies, and public institutions to tackle the most pressing challenges of our time. We offer a dynamic, collaborative environment where your expertise contributes to groundbreaking projects that weave together science, business, and policy for equitable human progress.

Overview:
We are seeking an experienced Senior PySpark Data Engineer to join our data engineering team. The ideal candidate will have extensive experience building scalable data processing pipelines with PySpark, a strong foundation in Python development, and familiarity with data science concepts. You will be responsible for designing and implementing robust data solutions for personalization and analytics platforms.
Key Responsibilities:

Data Pipeline Development
- Design, develop, and maintain large-scale data processing pipelines using PySpark
- Build reusable data processing components and libraries
- Implement efficient ETL/ELT processes for batch and real-time data processing

Software Engineering Excellence
- Develop high-quality, reusable Python libraries following object-oriented programming principles
- Implement comprehensive unit testing frameworks and maintain test coverage
- Apply software engineering best practices, including code reviews, version control, and CI/CD
- Debug and troubleshoot complex data processing issues

Data Science & Feature Engineering
- Collaborate with data scientists to implement feature engineering solutions for personalization systems
- Translate data science requirements into scalable production code
- Optimize algorithms for large-scale distributed processing
- Implement data quality checks and monitoring systems

Key Qualifications:
- 8-10 years of experience in data engineering or a related field
- Expert-level proficiency in PySpark and the Apache Spark ecosystem
- Strong Python programming skills with a deep understanding of OOP concepts
- Experience with distributed computing and big data technologies
- Solid understanding of data science concepts and machine learning fundamentals
- Proficiency in unit testing frameworks (pytest, unittest)
- Experience with SQL and database technologies
- Knowledge of cloud platforms (AWS, Azure, or GCP)

Data Engineering Expertise:
- Experience with data pipeline orchestration tools (Airflow)
- Understanding of data warehousing concepts and dimensional modeling
- Experience with containerization and orchestration (Docker, Kubernetes)

Preferred Qualifications:
- Experience with personalization systems and recommendation engines
- Knowledge of MLOps and machine learning lifecycle management
- Experience with real-time analytics and event-driven architectures

Technical Stack:
- Languages: Python, SQL
- Big Data: PySpark, Apache Spark, Hadoop ecosystem
- Cloud: Agnostic
- Databases: PostgreSQL, MySQL, MongoDB, Cassandra
- Streaming: Apache Kafka
- Orchestration: Apache Airflow
- Testing: pytest, unittest, mock
- Version Control: Git, GitLab/GitHub
- Containerization: Docker, Kubernetes