Job Description

Job Title: AI Engineer – Web Crawling & Field Data Extraction
Location: [Remote]
Department: Engineering / Data Science
Experience Level: Mid to Senior
Employment Type: Contract to Hire

About the Role:
We are looking for a skilled AI Engineer with strong experience in web crawling, data parsing, and AI/ML-driven information extraction to join our team. You will be responsible for developing systems that automatically crawl websites, extract structured and unstructured data, and intelligently map the extracted content to predefined fields for business use. This role combines practical web scraping, NLP techniques, and AI model integration to automate workflows that involve large-scale content ingestion.

Key Responsibilities:
Design and develop automated web crawlers and scrapers to extract information from various websites and online resources.
Implement robust and scalable data extraction pipelines that convert semi-structured/unstructured data into structured field-level data.
Use Natural Language Processing (NLP) and ML models to intelligently interpret and map extracted content to specific form fields or schemas.
Build systems that can handle dynamic web content, captchas, JavaScript-rendered pages, and anti-bot mechanisms.
Collaborate with frontend/backend teams to integrate extracted data into user-facing applications.
Monitor crawler performance, ensure compliance with legal/data policies, and manage scheduling, deduplication, and logging.
Optimize crawling strategies using AI/heuristics for prioritization, entity recognition, and data validation.
Create tools for auto-filling forms or generating structured records from crawled data.

Required Skills and Qualifications:
Bachelor’s or Master’s degree in Computer Science, AI/ML, Data Science, or related field.
3+ years of hands-on experience with web scraping frameworks (e.g., Scrapy, Puppeteer, Playwright, Selenium).
Proficiency in Python, with experience in BeautifulSoup, lxml, requests, aiohttp, or similar libraries.
Experience with NLP libraries (e.g., spaCy, NLTK, Hugging Face Transformers) to parse and map extracted data.
Familiarity with ML-based data classification, extraction, and field mapping.
Knowledge of structured data formats (JSON, XML, CSV) and RESTful APIs.
Experience handling anti-scraping techniques and rate-limiting controls.
Strong problem-solving skills, clean coding practices, and the ability to work independently.

Nice-to-Have:
Experience with AI form understanding (e.g., LayoutLM, DocAI, OCR).
Familiarity with Large Language Models (LLMs) for intelligent data labeling or validation.
Exposure to data pipelines, ETL frameworks, or orchestration tools (Airflow, Prefect).
Understanding of data privacy, compliance, and ethical crawling standards.

Why Join Us?
Work on cutting-edge AI applications in real-world automation.
Be part of a fast-growing and collaborative team.
Opportunity to lead and shape intelligent data ingestion solutions from the ground up.

Job Title

Company : ClientCurve

Location : Bikaner, Rajasthan

Created : 2025-08-01

Job Type : Full Time