Key Responsibilities:
• Design and implement predictive models and machine learning algorithms to solve healthcare-specific challenges
• Analyze large, complex healthcare datasets including electronic health records (EHR) and claims data
• Develop statistical models for patient risk stratification, treatment optimization, population health management, and revenue cycle optimization
• Build models for clinical decision support, patient outcome prediction, care quality improvement, and revenue cycle optimization
• Create and maintain automated data pipelines for real-time analytics and reporting
• Work with healthcare data standards (HL7 FHIR, ICD-10, CPT, SNOMED CT) and ensure regulatory compliance
• Develop and deploy models in cloud environments while creating visualizations for stakeholders
• Present findings and recommendations to cross-functional teams including clinicians, product managers, and executives
Qualifications Required:
• Bachelor's degree in Data Science, Statistics, Computer Science, Mathematics, or related quantitative field
• At least 2 years of hands-on experience in data science, analytics, or machine learning roles
• Demonstrated experience working with large datasets and statistical modeling
• Proficiency in Python or R for data analysis and machine learning
• Experience with SQL and database management systems
• Knowledge of machine learning frameworks such as scikit-learn, TensorFlow, PyTorch
• Familiarity with data visualization tools such as Tableau, Power BI, matplotlib, ggplot2
• Experience with version control systems (Git) and collaborative development practices
• Strong foundation in statistics, hypothesis testing, and experimental design
• Experience with supervised and unsupervised learning techniques
• Knowledge of data preprocessing, feature engineering, and model validation
• Understanding of A/B testing and causal inference methods
What You’ll Need to Be Successful (Required Skills):
• Large Language Model (LLM) Experience: At least 2 years of hands-on experience working with pre-trained language models (GPT, BERT, T5) including fine-tuning, prompt engineering, and model evaluation techniques
• Generative AI Frameworks: Proficiency with generative AI libraries and frameworks such as Hugging Face Transformers, LangChain, OpenAI API, or similar platforms for building and deploying AI applications
• Prompt Engineering and Optimization: Experience designing, testing, and optimizing prompts for various use cases including text generation, summarization, classification, and conversational AI applications
• Vector Databases and Embeddings: Knowledge of vector similarity search, embedding models, and vector databases (Pinecone, Weaviate, Chroma) for building retrieval-augmented generation (RAG) systems
• AI Model Evaluation: Experience with evaluation methodologies for generative models including BLEU scores, ROUGE metrics, human evaluation frameworks, and bias detection techniques
• Multi-modal AI Systems: Familiarity with multi-modal generative models combining text, images, and other data types, including experience with vision-language models and cross-modal applications
• AI Safety and Alignment: Understanding of responsible AI practices including content filtering, bias mitigation, hallucination detection, and techniques for ensuring AI outputs align with business requirements and ethical guidelines
Job Title
Data Scientist