#DataPlatformEngineer #SRE #SiteReliabilityEngineering #DataSRE #PlatformEngineer #BigDataJobs #Kubernetes #EKS #GKE #DevOps #CloudEngineering #DataInfrastructure #DistributedSystems #LinuxJobs #PythonJobs #Terraform #InfrastructureAsCode #HiringNow #ImmediateJoiners #Remote

Job Title: Data Platform Engineer
Location: Remote, India
Experience: 5+ Years
Job Type: Full-Time
Work Timing: Multiple Shifts (UK / US EST / US PST)
On-Call Support: Yes
Start Date: ASAP (Immediate / Early Joiners Preferred)

Job Summary

We are seeking highly skilled and dedicated Data Platform Engineers to build, operate, and scale large-scale data services that power mission-critical global products. This role is ideal for engineers who excel at designing reliable systems, reducing toil through automation, and collaborating closely with development teams to deliver seamless, high-availability data platforms. If you love architecting distributed systems, solving complex infrastructure problems, and ensuring platforms “just work,” this role is for you.

Role Description

Our client operates planet-scale Cloud Services infrastructure, and the Data Platform SRE team is responsible for managing large-scale data systems across both bare-metal and cloud environments.

You will work in an embedded SRE model, partnering directly with engineering teams to ensure system reliability, optimize performance, architect distributed systems, and drive automation across the data stack. This includes observability, SLO-driven operations, incident management, and end-to-end reliability engineering for services used globally.

You will work with a mix of open-source, vendor, and internally developed tools across globally distributed data centers.

Key Responsibilities

Platform Reliability & Operations
- Operate and support production data platforms at scale across internal and public cloud environments.
- Perform capacity planning, performance testing, and disaster recovery planning, and manage distributed systems.
- Participate in a 24x7 on-call rotation to ensure service availability.

Embedded SRE Collaboration
- Work closely with partner engineering teams in a unified SRE model.
- Implement and enforce consistent incident management processes.
- Define and maintain user-journey-based SLOs backed by strong observability metrics.

Data Platform Engineering
- Architect, deploy, tune, and troubleshoot Big Data ecosystem tools: Apache Spark, Flink, Airflow, Hive, Hadoop/HDFS, Trino, Druid, etc.
- Manage storage and coordination systems such as Cassandra, ZooKeeper, Redis, and cloud/block/blob storage technologies.

Automation & Tooling
- Reduce operational toil through automation and tooling improvements.
- Author and release code using Go, Python, Java, or Scala.
- Leverage configuration management, IaC, and CI/CD pipelines.

Infrastructure & Systems Engineering
- Work with Kubernetes-based environments at scale (EKS/GKE preferred).
- Manage Linux systems, containers, virtualization, and networking components.
- Diagnose complex issues using a rigorous, analytical approach to troubleshooting.

Must-Have Skills
- Strong hands-on Kubernetes experience.
- Experience with Amazon EKS and/or Google Kubernetes Engine (GKE).
- Proficiency in Python (basic to mid-level coding capabilities).
- Experience supporting Linux systems in production environments.
- Expertise in Big Data technologies: Spark, Flink, Airflow, Hive, Hadoop/HDFS, Trino, Druid, etc.
- High-availability systems architecture knowledge.
- Hands-on experience with Terraform or other Infrastructure-as-Code tools.

Nice-to-Have Skills
- Java programming experience.
- Experience with Pulumi (Infrastructure-as-Code).
Job Title
Platform Engineer