Job Title
Google Data Engineering

Job description

Experience
● 4 to 7 years of experience in data engineering, data warehousing, or a related field.
● Experience with dashboarding tools such as PLX dashboards and Looker Studio.
● Experience building data pipelines and reports, and establishing best practices and frameworks.
● Experience designing and developing scalable, actionable solutions (dashboards, automated collateral, web applications).
● Experience refactoring code for optimal performance.
● Experience writing and maintaining ETLs that operate on a variety of structured and unstructured sources.
● Familiarity with non-relational data storage systems (NoSQL and distributed database management systems).

Skills
● Strong proficiency in SQL, NoSQL, ETL tools, BigQuery, and at least one programming language (e.g., Python, Java).
● Hands-on experience with GCP services such as BigQuery, Dataflow, Dataproc, and Cloud SQL, as well as Terraform.
● Strong understanding of data structures, algorithms, and software design principles.
● Experience with data modeling techniques and methodologies.
● Proficiency in troubleshooting and debugging complex data-related issues.
● Ability to work independently and as part of a team.

Responsibilities
● Data Pipeline Development: Design, implement, and maintain robust, scalable data pipelines to extract, transform, and load data from various sources into our data warehouse or data lake.
● Data Modeling and Warehousing: Collaborate with data scientists and analysts to design and implement data models that optimize query performance and support complex analytical workloads.
● Cloud Infrastructure: Leverage Google Cloud and other internal storage platforms to build and manage scalable, cost-effective data storage and processing solutions.
● Data Quality Assurance: Implement data quality checks and monitoring processes to ensure the accuracy, completeness, and consistency of data.
● GCP Solutions: Build large-scale data and analytics solutions on GCP; use the platform efficiently to integrate large datasets from multiple sources, spanning data analysis, data modelling, data exploration/visualization, DevOps, and CI/CD; build automated data pipelines using BigQuery and Dataproc.
● Performance Optimization: Continuously monitor and optimize data pipelines and queries for performance and efficiency.
● Collaboration: Work closely with data scientists, analysts, and other stakeholders to understand their data needs and deliver solutions that meet their requirements.

Desirable
● Experience with Cloud Storage or equivalent cloud storage platforms.
● Knowledge of BigQuery ingress and egress patterns.
● Experience writing Airflow DAGs (see the sketch after this list).
● Knowledge of Pub/Sub, Dataflow, or other declarative data pipeline tools supporting batch and streaming ingestion.
● Other GCP services: Vertex AI.
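As an illustration of the pipeline, BigQuery, Airflow, and data-quality items above, here is a minimal sketch of the kind of batch-ingestion DAG the role describes, assuming Airflow 2.x with the apache-airflow-providers-google package installed; the DAG name, bucket, dataset, and table are hypothetical placeholders, not part of the posting.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.google.cloud.operators.bigquery import BigQueryCheckOperator
from airflow.providers.google.cloud.transfers.gcs_to_bigquery import GCSToBigQueryOperator

with DAG(
    dag_id="daily_sales_ingest",        # hypothetical pipeline name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Batch-ingest newline-delimited JSON files from Cloud Storage into BigQuery.
    load_sales = GCSToBigQueryOperator(
        task_id="load_sales",
        bucket="example-landing-bucket",                          # hypothetical bucket
        source_objects=["sales/{{ ds }}/*.json"],                 # one folder per run date
        destination_project_dataset_table="analytics.sales_raw",  # hypothetical table
        source_format="NEWLINE_DELIMITED_JSON",
        write_disposition="WRITE_TRUNCATE",
    )

    # Data-quality gate: BigQueryCheckOperator fails the run when the first
    # row of the query result evaluates to false (here: an empty table).
    check_not_empty = BigQueryCheckOperator(
        task_id="check_sales_not_empty",
        sql="SELECT COUNT(*) > 0 FROM analytics.sales_raw",
        use_legacy_sql=False,
    )

    load_sales >> check_not_empty
```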