Job Description

About the Role

We are looking for a highly skilled Senior Machine Learning Engineer to build and scale next-generation generative AI systems. This role sits at the intersection of machine learning and backend infrastructure, focusing on taking advanced models from experimentation to reliable, high-performance production systems.

You will work on cutting-edge generative video and multimodal AI use cases, contributing to scalable, low-latency systems used by millions of users globally.

Key Responsibilities

- Design, train, fine-tune, and evaluate generative and multimodal models (e.g., text-to-video, image-to-video, lip-sync, character consistency)
- Build and manage end-to-end ML pipelines, including data ingestion, preprocessing, training, evaluation, and model versioning
- Deploy and maintain scalable ML systems, including model serving, containerization, and GPU-optimized inference
- Implement MLOps best practices such as experiment tracking, model monitoring, drift detection, and A/B testing
- Optimize inference systems for low latency, high throughput, and cost-efficient GPU utilization
- Develop batching and caching strategies to meet production SLAs
- Collaborate with backend and platform teams to integrate ML services into distributed systems
- Contribute to long-term AI strategy, including foundational model training and fine-tuning pipelines

Required Qualifications

- 4–10 years of experience in Machine Learning or Applied ML Engineering
- Strong fundamentals in deep learning, Transformers, and generative model architectures
- Hands-on experience with large-scale model training and fine-tuning (e.g., LoRA, full fine-tuning)
- Proven experience in deploying and scaling ML models in production environments
- Strong understanding of MLOps practices and tools (e.g., MLflow, Weights & Biases)
- Experience with model serving frameworks such as Triton, TorchServe, vLLM, or similar
- Proficiency in Python and frameworks like PyTorch
- Experience working with cloud platforms (AWS, GCP, or Azure), including GPU provisioning and autoscaling
- Ability to work in fast-paced, ambiguous environments with cross-functional teams

Preferred Qualifications

- Experience with video generation, diffusion models, or multimodal architectures
- Familiarity with LoRA/IC-LoRA techniques for character or identity consistency
- Knowledge of inference optimization techniques such as quantization (FP8/INT8), batching, and GPU memory management
- Experience with audio/video systems (e.g., TTS, voice cloning, lip-sync pipelines)
- Background in media, OTT, or large-scale content platforms

What We Offer

- Competitive compensation
- Opportunity to work on cutting-edge AI products at scale
- High-impact role with ownership across the ML lifecycle
- Collaborative and fast-paced work environment
- Continuous learning and growth opportunities

Job Title

Company : Recro

Location : Pune, Maharashtra

Created : 2026-04-10

Job Type : Full Time