Data Scientist - Clinical Data Extraction & AI Integration

Experience Level: 3-6 Years
Location: Chennai/Hybrid
Employment Type: Full-time

About the Role

We are seeking an experienced Data Scientist to join our healthcare technology team, focusing on medical document processing and data extraction systems. You'll be working with cutting-edge AI technologies to build robust solutions that extract critical information from clinical documents, improving healthcare data workflows and patient care outcomes.

Key Responsibilities

Data Science & Analytics
- Design and implement statistical models for medical data quality assessment
- Develop predictive algorithms for encounter classification and validation
- Build machine learning pipelines for document pattern recognition
- Create data-driven insights from clinical document structures
- Implement feature engineering for medical terminology extraction

Advanced Analytics & ML
- Apply natural language processing (NLP) techniques to clinical text
- Develop statistical validation frameworks for extracted medical data
- Build anomaly detection systems for medical document processing
- Create predictive models for discharge date estimation and encounter duration
- Implement clustering algorithms for provider and encounter classification

AI & LLM Integration
- Integrate and optimize Large Language Models via AWS Bedrock and API services
- Design and refine AI prompts for clinical content extraction with high accuracy
- Implement fallback logic and error handling for AI-powered extraction systems
- Develop pattern matching algorithms for medical terminology
- Create validation layers for AI-extracted medical information

Healthcare Domain Expertise
- Work with medical document structures
- Implement healthcare-specific validation rules
- Handle medical terminology extraction and clinical context analysis
- Ensure HIPAA compliance and data security best practices

Technologies & Tools

Languages: Python 3.8+, R, SQL, JSON
Data Science Stack: pandas, numpy, scipy, scikit-learn, spaCy, NLTK
ML Frameworks: TensorFlow, PyTorch, transformers, huggingface
Visualization: matplotlib, seaborn, plotly, Tableau, PowerBI
AI Platforms: AWS Bedrock, Anthropic Claude, OpenAI APIs
Cloud Services: AWS (SageMaker, S3, Lambda, Bedrock)
Research Tools: Jupyter notebooks, Git, Docker, MLflow