About the Role

The Data Platform team at Chargebee builds and maintains scalable data systems that power internal analytics, business intelligence, and customer-facing data features. As a Data Engineer, you will build and maintain reliable data pipelines across multiple layers of the data platform, including data ingestion, distributed processing, transformation, and data serving.

You will collaborate closely with product engineers, analysts, and platform teams to ensure that data is ingested, processed, and made available efficiently for analytics and product use cases. This role offers the opportunity to work with modern data technologies and distributed systems, solving real-world data challenges at scale. The team operates in a fast-paced, collaborative environment, building reliable and scalable infrastructure that supports Chargebee's growing data needs.

What You Will Work On

As a Data Engineer, you will contribute to the development and evolution of Chargebee's data platform.
This includes working on systems that ingest large volumes of data, process and transform it using distributed frameworks such as Apache Spark, and make it available for analytics and customer-facing products.

The role provides exposure to:
- Large-scale data ingestion and processing pipelines
- Streaming and event-driven architectures
- Distributed data processing frameworks
- Cloud-based data infrastructure
- Building and maintaining data lakes and data warehouses
- Building data capabilities that power both internal analytics and customer-facing products

Key Responsibilities
- Design, build, and maintain data ingestion and processing pipelines across multiple layers of the data platform.
- Develop and optimize ETL/ELT workflows to support scalable and reliable data processing.
- Build and maintain distributed data processing jobs using frameworks such as Apache Spark.
- Work with streaming and queue-based systems such as Kafka to process event-driven data.
- Design efficient data models and transformations to support analytics and product use cases.
- Ensure high standards of data reliability, integrity, and performance across the data platform.
- Troubleshoot and debug production data pipelines and distributed systems.
- Collaborate with data analysts, product teams, and engineers to support evolving data requirements.
- Participate in code reviews, design discussions, and agile development processes.
- Document data pipelines, architecture decisions, and platform workflows.

Minimum Qualifications
- Bachelor's degree in Computer Science, Mathematics, Engineering, or a related technical field, or equivalent practical experience.
- 1–2 years of experience working with data processing workflows and pipelines.
- Experience working on production systems, including troubleshooting and debugging technical issues.
- Experience with distributed computing systems such as Apache Spark.
- Experience using Git workflows in collaborative development environments.
- Strong knowledge of SQL.
- Proficiency in at least one programming language such as Java, Python, or Scala.
- Good understanding of data structures, data models, and relational database concepts.
- Experience with relational databases such as PostgreSQL, MySQL, or similar.
- Experience working in an Agile development environment.

Good-to-Have Qualifications
- Experience working within the AWS ecosystem.
- Experience building data pipelines using Apache Spark.
- Familiarity with distributed data processing frameworks and large-scale data systems.
- Understanding of data warehouse and data lake architectures.
- Strong technical communication and problem-solving skills.
- Ability to investigate and debug issues across distributed systems.
- Knowledge of or prior experience with open table formats such as Delta Lake, Iceberg, or Hudi.
Job Title: Data Engineer