Job Description

Scribie is an AI-powered, Human Verified audio and video transcription service, trusted globally since 2008. We specialize in delivering accurate and reliable transcription solutions by blending advanced AI technology with human expertise. Headquartered in the US, we operate with a hybrid model in our Bangalore office, combining the flexibility of remote work with the collaboration of in-person engagement. This approach offers our team both autonomy and growth opportunities in a dynamic and supportive environment.

We’re building production-grade audio foundation models for high-stakes legal and enterprise transcription: real customer data, messy audio, real consequences.

This is not a paper-only research role.

You’ll own the full ML lifecycle:

- Fine-tuning large audio / multimodal models using SFT, LoRA, and RL-based preference optimization (DPO / PPO / ORPO)
- Beating strong baselines like Whisper-large, GPT-4o, Gemini, Claude on domain-specific data
- Designing WER, diarization, and alignment-driven evaluation stacks
- Taking models from research notebooks → production inference services
- Running daily experiments that directly impact quality, cost, and customer satisfaction

You’ll be our first ML hire, with real ownership over the audio ML roadmap: not a side project, not a support role.

Job Title

Company : Scribie

Location : Bangalore, Karnataka

Created : 2026-01-28

Job Type : Full Time