Job Title: Data Engineer / Senior Data Engineer
Location: Bangalore
Experience: 5+ years
Job Type: Hybrid, Full-time
Immediate joiners or candidates with a notice period of less than 10 days are needed.

Purpose:
As a Data Engineer at LogixHealth, you will work with a globally distributed team of engineers to design and build cutting-edge solutions that directly improve the healthcare industry. You'll contribute to our fast-paced, collaborative environment and bring your expertise to continue delivering innovative technology solutions, while mentoring others.

Duties and Responsibilities:
- Contribute to the creation of a self-service data platform for reporting and analytics
- Design and build data solutions using Databricks, SQL, Python, Spark, and Delta Lake in the Azure ecosystem (Blob Storage, Data Factory, Event Hubs)
- Adhere to best practices for ETL/ELT processes (data quality management, data processing, data partitioning, maintainability, and reusability)
- Collaborate with engineers, product, and business leaders to ensure the data platform is integrated with other systems and technologies (Tableau, Power BI, APIs, custom applications)
- Establish CI/CD processes, test frameworks, infrastructure-as-code tools, and monitoring/alerting (Git, Terraform, Azure DevOps / GitHub Actions / Jenkins, Azure Monitor / Datadog)
- Adhere to the Code of Conduct and be familiar with all compliance policies and procedures stored in LogixGarden relevant to this position

Qualifications:
To perform this job successfully, an individual must be able to perform each duty satisfactorily. The requirements listed below are representative of the knowledge, skills, and/or abilities required. Reasonable accommodation may be made to enable individuals with disabilities to perform the duties.

Education (Degrees, Certificates, Licenses, Etc.):
BS (or higher, MS/PhD) degree in Computer Science or a related field, or equivalent technical experience.
Experience:
- 5+ years of strong hands-on experience with Apache Spark and Databricks, building scalable data pipelines and distributed data processing systems in cloud environments
- Deep expertise in the Databricks ecosystem, including:
  - Delta Lake
  - Delta Live Tables (DLT)
  - Unity Catalog
  - Workflow orchestration (Jobs)
- Strong programming experience in PySpark / Spark (Python or Scala preferred) for large-scale data engineering workflows
- Proven experience designing high-performance Spark jobs and applying optimization techniques (partitioning, caching, AQE, joins, skew handling)
- Experience integrating Databricks with (good to have):
  - Azure Data Factory
  - Event Hubs / streaming pipelines
  - External orchestration tools such as Airflow
- Working knowledge of cloud data platforms (Azure preferred), including Blob Storage and NoSQL databases
- Experience with relational databases (MS SQL, PostgreSQL, MySQL) is good to have
- Exposure to data governance, security, and compliance (Unity Catalog, RBAC, data lineage)

Core Skills (Needed):
- Expert-level Spark (PySpark/Scala):
  - DataFrames, Spark SQL, Structured Streaming
  - Performance tuning and debugging
  - Handling large-scale datasets (TB+ scale)
- Databricks expertise:
  - Notebooks, Jobs, Workflows
  - Delta Lake (ACID, schema evolution, optimization)
  - Delta Live Tables (pipeline design and orchestration)
  - Unity Catalog (data governance, access control)
- Data engineering on Databricks:
  - Batch and streaming pipelines
  - Medallion architecture (Bronze/Silver/Gold)
  - Incremental processing and CDC patterns