Job Title: Senior Data Engineer

Key Responsibilities:
• Design, implement, and optimize scalable data pipelines using Databricks and Apache Spark.
• Architect data lakes using Delta Lake, ensuring reliable and efficient data storage.
• Manage metadata, security, and lineage through Unity Catalog for governance and compliance.
• Ingest and process streaming data using Apache Kafka and other real-time frameworks.
• Collaborate with ML engineers and data scientists on data pipelines for LLM-based AI/GenAI projects.
• Apply CI/CD and DevOps practices to automate data workflows and deployments (e.g., GitHub Actions, Jenkins, Terraform).
• Optimize query performance and data transformations using advanced SQL.
• Implement and uphold data governance, quality, and access-control policies.
• Support production data pipelines and resolve incidents and performance bottlenecks.
• Contribute to architectural decisions around data strategy and platform scalability.

Required Skills & Experience:
• 5+ years of experience in data engineering roles.
• Proven expertise in Databricks, Delta Lake, and Apache Spark (PySpark preferred).
• Deep understanding of Unity Catalog for fine-grained data governance and lineage tracking.
• Proficiency in SQL for large-scale data manipulation and analysis.
• Hands-on experience with Kafka for real-time data streaming.
• Solid understanding of CI/CD, infrastructure automation, and DevOps principles.