Mandatory Skills:
* Cloud: GCP, Azure Data Lake
* Python, Airflow
* SQL
* Good communication skills

Additional Information:
* Experience: 5-8 years
* Shift timings: 11:30 AM to 8:30 PM

Incident Management & Triage:
* Monitor the health and performance of data pipelines built using Cloud Composer, Dataflow/Airflow, and Cloud Data Fusion.
* Serve as the first point of contact for incidents related to data pipeline failures, delays, or data inconsistencies.
* Perform initial investigation and triage of incidents to identify the potential root cause (e.g., pipeline logic, upstream data issues, downstream system unavailability, data dependency problems).
* Log, prioritize, track, and manage incidents using standard ticketing systems.

Troubleshooting & Root Cause Analysis:
* Analyze pipeline execution logs, system metrics (within GCP), and data flows to diagnose complex issues.
* Investigate pipeline failures specifically caused by dependencies, such as incomplete "raw data uploads" into Cloud Storage/BigQuery before ETL/transformation jobs commence.
* Identify and analyze data quality issues, tracing discrepancies back to potential source system errors (e.g., PeopleSoft, Salesforce) or transformation logic flaws.
* Utilize GCP tools (Cloud Monitoring, Cloud Logging, BigQuery query analysis) for diagnostics.

Coordination & Escalation:
* Collaborate closely with teams managing upstream systems (e.g., PeopleSoft DB admins, Salesforce admins) and downstream applications when issues originate outside the core data platform.
* Escalate unresolved or infrastructure-related issues (network, core compute, IAM beyond application access) to the dedicated Cloud-Ops team.
* Work with the MuleSoft integration team to troubleshoot issues related to data flow through the integration layer.
* Clearly communicate incident status, impact, and resolution steps to stakeholders.

Maintenance & Operational Tasks:
* Perform routine checks and minor corrective actions to ensure pipeline stability.
* Assist in managing dependencies between data loading processes and ETL/ELT pipelines.
* Maintain and update operational runbooks, knowledge base articles, and support documentation.
* Participate in post-incident reviews to identify preventative measures.

Data Lake Support:
* Monitor data loading processes into BigQuery and Cloud Storage.
* Perform basic queries in BigQuery to validate data presence, investigate quality issues, or support troubleshooting efforts.

Required Qualifications & Skills:
* Proven experience in an Application Support, Production Support, or Site Reliability Engineering (SRE) role, preferably focused on data platforms or ETL/ELT processes.
* Hands-on experience supporting and troubleshooting applications/pipelines within the Google Cloud Platform (GCP) environment.
* Working knowledge of GCP data services:
  * Orchestration: Cloud Composer (Airflow)
  * Processing: Dataflow and/or Cloud Data Fusion
  * Storage/Data Lake: BigQuery and Cloud Storage
* Strong SQL skills for data querying and analysis (especially within BigQuery).
* Understanding of ETL/ELT concepts, data warehousing principles, and data pipeline dependencies.
* Excellent analytical and problem-solving skills with a systematic approach to troubleshooting.
* Experience with incident management processes and tools (e.g., JIRA, ServiceNow).
* Basic understanding of cloud networking concepts (VPC, firewalls) and IAM principles within GCP for context during troubleshooting.
* Strong communication skills (verbal and written) with the ability to explain technical issues to both technical and non-technical audiences.
* Ability to work effectively both independently and as part of a distributed team.
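For the BigQuery data-presence checks mentioned under Data Lake Support, a minimal sketch of what such a validation query might look like, built in Python. The dataset, table, and column names are hypothetical placeholders; in practice the query would be executed via the `bq` CLI or the BigQuery client library.

```python
# Sketch: build a BigQuery SQL statement that checks whether a raw-data
# load for a given date has landed before downstream ETL runs.
# Dataset/table/column names below are assumed examples, not real objects.

def build_presence_check(dataset: str, table: str, load_date: str) -> str:
    """Return a query counting rows loaded for a given partition date."""
    return (
        f"SELECT COUNT(*) AS row_count "
        f"FROM `{dataset}.{table}` "
        f"WHERE DATE(load_timestamp) = '{load_date}'"
    )

query = build_presence_check("raw_zone", "salesforce_accounts", "2024-01-15")
print(query)
```

A support engineer would compare `row_count` against an expected threshold (or the previous day's count) before clearing the dependent transformation job to run.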
Job Title: GCP Data Lake
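The initial-triage step described earlier (mapping a pipeline failure to a probable root-cause bucket before escalating) could be sketched roughly as follows. The log patterns and category names are illustrative assumptions, not real GCP log formats or a real tool's output.

```python
import re

# Illustrative triage helper: classify a pipeline error message into a
# coarse root-cause bucket so the incident can be routed correctly.
# Patterns and buckets are assumed examples for the sketch.
TRIAGE_RULES = [
    (re.compile(r"object not found|no files matched", re.I),
     "upstream data missing"),
    (re.compile(r"permission denied|403|iam", re.I),
     "access/IAM (escalate to Cloud-Ops)"),
    (re.compile(r"syntax error|type mismatch|schema", re.I),
     "transformation logic"),
    (re.compile(r"timeout|deadline exceeded|unavailable", re.I),
     "downstream/system availability"),
]

def triage(error_message: str) -> str:
    """Return the first matching root-cause bucket, else flag for review."""
    for pattern, bucket in TRIAGE_RULES:
        if pattern.search(error_message):
            return bucket
    return "unknown (manual investigation)"

print(triage("GCS object not found: gs://raw/2024-01-15/export.csv"))
```

In practice the routing table would live in a runbook, but the first-match ordering matters: a missing upstream file often also produces downstream timeouts, so the upstream check comes first.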