Senior Data Scientist (R&D) - AI/ML for Private Credit

Designation: Member of Technical Staff
Location: Bangalore
Experience: 5+ years
Mode of work: Work from Office

About the Role

We're looking for a Senior Data Scientist to lead R&D initiatives at the intersection of LLMs, information retrieval, and private credit analytics. You'll fine-tune small language models on financial documents, build agentic workflows for multi-step reasoning, and develop production-ready extraction systems that power our AI platform for institutional investors.

This role bridges cutting-edge research with real-world deployment. You'll work closely with Prompt Engineers on hybrid LLM+ML approaches, partner with QA Data on evaluation frameworks, and translate research into detailed specs for our Platform Engineering team. Your models will process thousands of credit agreements daily, requiring both innovation and reliability.

What You'll Do

Model Development & Fine-tuning
- Fine-tune Small Language Models on a proprietary private credit corpus (credit agreements, indentures, term sheets)
- Develop information retrieval systems: semantic search, ranking algorithms, and context-aware retrieval
- Build agentic workflows with multi-step reasoning, tool use, reflection, and self-correction capabilities
- Train classification models for document type identification, section detection, and entity recognition
- Create extraction models: NER for financial entities, relation extraction, structured table parsing

Research & Innovation
- Partner with Prompt Engineers on prompt optimization strategies and hybrid LLM+ML approaches
- Experiment with the latest techniques: RAG architectures, fine-tuning methods (LoRA, QLoRA), model distillation
- Present research findings to the engineering team and stakeholders monthly (progress, insights, recommendations)
- Stay current with academic research and industry developments in NLP, LLMs, and financial ML

Production Readiness & Deployment
- Write detailed technical specs for the Platform team: model architecture, dependencies, deployment steps, API contracts
- Define production readiness criteria: performance benchmarks, edge case handling, failover mechanisms, rollback procedures
- Create comprehensive model cards: intended use, limitations, bias analysis, performance metrics, monitoring requirements
- Optimize models for production constraints: latency 95%, cost

Evaluation & Quality Assurance
- Work with QA Data Teams on model evaluation frameworks and benchmark dataset creation
- Build evaluation frameworks with offline metrics (accuracy, precision, recall, F1) and online metrics (user feedback, business impact)
- Create benchmark datasets: 1K+ examples per task with expert annotations and inter-annotator agreement analysis
- Define task-specific success criteria tied to business outcomes (e.g., covenant extraction accuracy → analyst time savings)

Monitoring & Continuous Improvement
- Monitor model performance in production: accuracy drift, latency degradation, error patterns, user feedback loops
- Investigate performance degradation: is it data drift, concept drift, or infrastructure issues?
- Retrain models quarterly with new data, improved techniques, and expanded coverage of edge cases
- Maintain model performance dashboards and alert systems for critical degradation

Required Qualifications

Technical Expertise
- 5+ years of experience in ML/NLP, with 2+ years focused on LLMs and transformers
- Strong hands-on experience with fine-tuning language models (BERT, RoBERTa, GPT-style models, LLaMA/Mistral)
- Expertise in information retrieval: vector databases (Pinecone, Weaviate, Qdrant), embedding models, semantic search
- Production ML deployment experience: model serving (TensorFlow Serving, TorchServe, ONNX), monitoring, A/B testing
- Proficiency in the Python ML stack: PyTorch/TensorFlow, Hugging Face, LangChain, scikit-learn, pandas

Domain & Problem-Solving
- Experience with document processing and extraction tasks (OCR pipelines, layout analysis, table extraction)
- Ability to translate vague business requirements into concrete ML problem statements
- Track record of moving models from research/prototype to production with measurable impact
- Strong understanding of evaluation methodology: offline vs online metrics, statistical significance testing

Collaboration & Communication
- Experience writing technical documentation for engineering teams (architecture docs, API specs, runbooks)
- Ability to present complex technical concepts to non-technical stakeholders
- Comfortable working in cross-functional teams with prompt engineers, platform engineers, and QA analysts

Preferred Qualifications
- Experience in financial services, credit analysis, or FinTech (private credit, leveraged finance, structured products)
- Familiarity with agentic frameworks: LangGraph, AutoGPT, ReAct patterns, tool-calling workflows
- Knowledge of model compression techniques: quantization, pruning, knowledge distillation
- Experience with MLOps tools: MLflow, Weights & Biases, DVC, feature stores
- Understanding of financial document structures: credit agreements, indentures, term sheets, prospectuses
- Publications or patents in NLP, information extraction, or document understanding

Who We Are

Alphastream.ai envisions a dynamic future for the financial world, where innovation is propelled by state-of-the-art AI technology and enriched by a profound understanding of credit and fixed-income research. Our mission is to empower asset managers, research firms, hedge funds, banks, and investors with smarter, faster, and curated data. We provide accurate, timely information, analytics, and tools across simple to complex financial and non-financial data, enhancing decision-making. With a focus on bonds, loans, financials, and sustainability, we offer near real-time data via APIs and PaaS (Platform as a Service) solutions that act as the bridge between our offerings and seamless workflow integration.

To learn more about us:

What We Offer

At Alphastream.ai we offer a dynamic and inclusive workplace where your skills are valued and your career can flourish. Enjoy competitive compensation, a comprehensive benefits package, and opportunities for professional growth. Immerse yourself in an innovative work environment, maintain a healthy work-life balance, and contribute to a diverse and inclusive culture. Join us to work with cutting-edge technology, and be part of a team that recognizes and rewards your achievements, all while fostering a fun and engaging workplace culture.

Disclaimer

Alphastream.ai is an equal opportunities employer. We work to provide a supportive and inclusive environment where all individuals can maximize their full potential. Our skilled and creative workforce is comprised of individuals drawn from a broad cross section of all communities in which we operate and who reflect a variety of backgrounds, talents, perspectives, and experiences. Our strong commitment to a culture of inclusion is evident through our constant focus on recruiting, developing, and advancing individuals based on their skills and talents.