
Job Title : Senior Data Pipeline Engineer (Python/Airflow)

Company : Global Connect Technologies

Location : Guadalajara, Mexico Metropolitan Area

Created : 2025-11-29

Job Type : Full Time

Job Description

Role Summary

Own the design and operation of reliable, secure, and cost‑efficient data pipelines built with Apache Airflow (2.x) and Python. You'll deliver batch and streaming ingestion, transformations, and curated datasets that power connected and infotainment experiences. You'll lead Python-based ETL/ELT, DAG orchestration, and data platform reliability, security, and observability across our cloud environments. We are an Android app development team looking for someone to own and lead our cloud data engineering in close partnership with mobile, backend, and product teams. This ownership includes taking abstract requirements, refining and defining them, creating development tasks, and then implementing those tasks.

Responsibilities

• Design, build, and maintain Airflow DAGs using TaskFlow, dynamic DAGs, deferrable operators, providers, and the secrets backend; manage cross‑DAG dependencies and SLAs (a minimal TaskFlow sketch appears after this section).
• Develop Python ETL/ELT code to ingest from APIs, object storage, message buses, and databases; package code as reusable libraries.
• Operate Airflow on managed or self‑hosted platforms (e.g., Azure, Kubernetes deployments); implement blue/green or canary DAG releases.
• Implement data quality checks and testing, with unit tests for operators/hooks and DAG validation in CI.
• Build event‑driven pipelines for near‑real‑time processing; manage schemas and compatibility.
• Model and manage data stores across SQL and blob storage; design partitioning, clustering, and retention.
• Observability & lineage: instrument metrics and logs, set SLAs and alerts, drive post‑incident reviews and reliability improvements.
• Security & governance: apply least‑privilege IAM, secrets management, PII handling, and data contracts; enforce RBAC in Airflow and warehouses.
• CI/CD & IaC: build pipelines to lint, test, and deploy DAGs and Python packages; provision infrastructure with Terraform/Helm; containerize with Docker.
• Cost & performance: tune task parallelism, autoscaling, storage formats, and compute footprints to optimize cost and performance.
• Collaboration: work closely with Android/backend teams to define interfaces and data contracts; document decisions and operational runbooks.

Skills and Qualifications

• 8+ years in data engineering or backend engineering with strong Python expertise.
• 2+ years of Airflow 2.x expertise (operators, hooks, sensors, TaskFlow, scheduler tuning).
• Proven experience designing reliable ETL/ELT at scale (batch and streaming) with robust testing and monitoring.
• Strong SQL and data modeling skills; hands‑on with one or more data warehouses (BigQuery, Redshift, Snowflake) and relational systems (PostgreSQL/MySQL).
• Familiarity with security best practices (RBAC, OAuth2/OIDC for service integrations), API gateways, and secrets management (Vault, AWS Secrets Manager, GCP Secret Manager).
• Comfortable operating in production: monitoring, troubleshooting, and performance tuning.
• Excellent written and verbal communication; clear trade‑off communication and autonomous execution with well‑documented decisions.

Nice to Have

• CI/CD, Git, code reviews, and Infrastructure as Code (Terraform); containerization with Docker and orchestration on Kubernetes.
• Spark (PySpark) or Beam for large‑scale processing.
• Automotive/IoT telemetry domain exposure.
• Experience with Kafka (or Event Hubs/Pub/Sub equivalents), schema registry, and CDC patterns.
• dbt for transformations and testing; Delta Lake/medallion patterns; feature stores.
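As a rough illustration of the Airflow 2.x TaskFlow style named in the first responsibility above, the sketch below defines a small extract/transform/load DAG. The DAG id, daily schedule, and placeholder records are assumptions for illustration only, not details from this posting; it assumes Airflow 2.4+ for the `schedule` argument.

```python
# Minimal TaskFlow DAG sketch (illustrative assumptions, not a spec from this posting).
from datetime import datetime

from airflow.decorators import dag, task


@dag(
    schedule="@daily",                # Airflow 2.4+ argument; run once per day
    start_date=datetime(2025, 1, 1),
    catchup=False,                    # do not backfill past runs
    tags=["example", "etl"],
)
def example_etl():
    @task()
    def extract() -> list:
        # Hypothetical source; in practice this would use an API, object-store,
        # or database hook.
        return [{"device_id": "abc", "reading": 42}]

    @task()
    def transform(records: list) -> list:
        # Simple per-record transformation as a placeholder.
        return [{**r, "reading_doubled": r["reading"] * 2} for r in records]

    @task()
    def load(records: list) -> None:
        # Placeholder for writing to a warehouse or curated dataset.
        print(f"Loaded {len(records)} records")

    # TaskFlow wires dependencies via return values (passed through XCom).
    load(transform(extract()))


example_etl()
```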
Logistics

• Contract role; collaborate across multi‑disciplinary teams in a fast‑moving, target‑oriented environment.
• Motivated team player with high attention to detail and a creative, pragmatic approach to problem solving.