Job Title: AI Data Engineer
Employment Type: Full-Time with Cogent IBS
Location: Remote (primarily)

Responsibilities:
• Design and implement the Agentic RAG architecture for large-scale data standardization.
• Build scalable ETL pipelines using PySpark and Databricks to process 1.2B+ records.
• Design, develop, and optimize Databricks jobs, workflows, and clusters.
• Leverage Delta Lake for reliable, scalable data storage and processing.
• Integrate vector databases, LLM orchestration, and external catalog APIs (MTP, PCdb, VCdb).
• Implement confidence scoring, retry logic, and human-in-the-loop workflows.
• Optimize performance, cost, and scalability across distributed pipelines.
• Ensure auditability, lineage tracking, and data governance compliance.

Requirements:
• Strong Python, PySpark, and SQL expertise.
• Hands-on Databricks experience (notebooks, workflows, jobs, Delta Lake).
• Experience with large-scale ETL and distributed data processing.
• Working knowledge of LLMs, RAG, embeddings, and vector databases.
• Cloud experience on AWS, Azure, or GCP.
• Experience integrating REST APIs and external data catalogs.
• Familiarity with data governance, monitoring, and logging frameworks.

Interested candidates, please send your resume and mention the job title in the subject line.