Job Title


Scala Data Engineer


Company : VMC Soft Technologies, Inc


Location : Bengaluru, Karnataka


Created : 2026-02-23


Job Type : Full Time


Job Description

Job Title: Scala Data Engineer
Location: Bengaluru
Experience: 8+ years of IT experience

Job Summary:
We are seeking a highly skilled and experienced Senior Scala Data Engineer to join our dynamic data team. In this role, you will be instrumental in designing, developing, and maintaining our next-generation data pipelines and platforms using Scala, Apache Spark, and cloud-native technologies. You will work on challenging problems involving large-scale data ingestion, transformation, and processing, contributing directly to our analytical capabilities and product features.

Note: We are looking for immediate joiners. More than 6-7 years of experience in Scala is mandatory; candidates with less will not be considered.

Key Responsibilities:
· Design & Development: Architect, build, and optimize robust, scalable, and efficient data pipelines using Scala and Apache Spark (Spark Core, Spark SQL, Spark Streaming).
· Data Ingestion: Develop solutions for ingesting high-volume, high-velocity data from various sources (e.g., relational databases, NoSQL databases, APIs, message queues like Kafka, log files) into our data lake/warehouse.
· Data Transformation: Implement complex data transformations, aggregations, and feature engineering logic to prepare data for analytics, machine learning models, and operational systems.
· Performance Optimization: Identify and resolve performance bottlenecks in Spark jobs and data pipelines, ensuring optimal resource utilization and execution times.
· Data Quality & Governance: Implement data validation, monitoring, and alerting mechanisms to ensure data accuracy, completeness, and consistency. Contribute to data governance best practices.
· Cloud Infrastructure: Leverage and optimize cloud services (e.g., AWS EMR/Glue, Azure Databricks/Synapse, GCP DataProc/BigQuery) for data processing and storage.
· Automation & Orchestration: Design and implement automated workflows for data pipelines using tools like Apache Airflow, AWS Step Functions, or similar.

Required Qualifications:
· Experience: 5+ years of professional experience in data engineering, with a strong focus on building large-scale data solutions.
· Scala Expertise: Proven advanced proficiency in the Scala programming language.
· Apache Spark: Deep hands-on experience with Apache Spark (Core, SQL, Streaming) for batch and real-time data processing.
· Cloud Platforms: Extensive experience with at least one major cloud provider (AWS, Azure, or GCP) and their relevant data services (e.g., AWS S3, EMR, Glue, Kinesis; Azure Data Lake, Databricks, Event Hubs; GCP GCS, DataProc, Pub/Sub).
· Data Warehousing: Strong understanding of data warehousing concepts, dimensional modeling (star/snowflake schemas), and ETL/ELT processes.
· SQL: Expert-level SQL skills for data querying, manipulation, and optimization.
· Distributed Systems: Experience working with distributed systems and an understanding of their challenges (consistency, fault tolerance, concurrency).
· Version Control: Proficiency with Git and collaborative development workflows.

Nice-to-Haves:
· Streaming Technologies: Experience with real-time streaming platforms like Apache Kafka, Apache Flink, or Kinesis.
· Containerization & Orchestration: Experience with Docker, Kubernetes, and container orchestration for Spark applications.
· Data Orchestration Tools: Hands-on experience with Apache Airflow, Dagster, Prefect, or similar workflow management tools.
· NoSQL Databases: Experience with NoSQL databases such as Cassandra, MongoDB, DynamoDB, or HBase.
· Data Lakehouse/Modern DW: Experience with technologies like Delta Lake, Apache Iceberg, Snowflake, Redshift, or BigQuery.
· MLOps: Familiarity with MLOps principles and supporting data pipelines for machine learning models.
· CI/CD: Experience setting up and maintaining CI/CD pipelines for data engineering projects.
· Performance Tuning: Advanced knowledge of Spark performance tuning techniques, including memory management, shuffle optimization, and data partitioning strategies.
· Certifications: Relevant cloud certifications (AWS Certified Data Analytics, Azure Data Engineer Associate, GCP Professional Data Engineer) or Spark certifications.

Thanks & Regards,
Vibha Seth
Technical Recruiter
E-Mail: vibha@
Contact: 9935984975
LinkedIn: /in/vibha-seth-14337b241