About this Job

EcoRatings is seeking a rigorous, detail-oriented Data Engineer to build the critical data pipelines that ingest massive datasets from enterprise systems into our proprietary AI ecosystem. As a core member of the Intelligence & Data track, you will bridge the gap between raw enterprise data (Oracle, SQL, FTP) and our generative AI models. This role is ideal for an engineer who thrives on solving complex data ingestion challenges and ensuring data integrity in high-stakes environments.

Responsibilities

- Data Ingestion & ETL: Design, develop, and maintain scalable ETL/ELT pipelines that ingest data from diverse enterprise sources, including SQL databases, Oracle ERPs, and flat files delivered over FTP/SFTP.
- Pipeline Orchestration: Build and manage data workflows using tools such as Apache Airflow or Prefect to ensure timely, reliable data delivery to the AI/ML team.
- Schema Design: Collaborate with the Lead AI Engineer to define and implement data schemas that optimize the performance of RAG (Retrieval-Augmented Generation) pipelines.
- Database Management: Optimize and manage both relational (PostgreSQL/MySQL) and non-relational storage, including vector databases such as Pinecone or Weaviate.
- Data Cleaning & Validation: Implement automated data validation and cleaning scripts so that the "intelligence" layer receives high-quality, audit-ready data.
- API Development: Build and maintain internal APIs and connectors that enable seamless communication between the data warehouse and the AI/ML layer.
- Security & Compliance: Ensure all data pipelines adhere to strict enterprise security protocols, including encryption at rest and in transit, to protect sensitive client information.
- Collaboration: Work closely with the Full Stack Lead and AI Engineers to synchronize data ingestion logic with application requirements and AI "thinking" processes.

Qualifications

- Education: Bachelor's or Master's degree in Data Science, Computer Science, Information Technology, or a related quantitative field.
- Technical Proficiency: Advanced expertise in Python and SQL, with a deep understanding of database internals and query optimization.
- Enterprise Integration: Proven experience connecting to and extracting data from enterprise-grade systems such as Oracle, SAP, or Microsoft Dynamics.
- Data Engineering Tools: Hands-on experience with modern data stack tools (e.g., dbt, Airflow, Snowflake, or Databricks).
- Cloud Infrastructure: Strong familiarity with AWS data services (S3, Redshift, Glue) or their Azure equivalents.
- Big Data Frameworks (optional but preferred): Knowledge of Spark or Flink for processing large-scale environmental datasets.
- Version Control: Proficiency with Git and experience working in an Agile development environment.
- Problem Solving: A systematic approach to debugging complex data flows and a commitment to data accuracy and reliability.
Job Title: Data Engineer