
Job Title : Machine Learning Engineer
Company : Valiance Solutions
Location : Nashik, Maharashtra
Created : 2025-07-23
Job Type : Full Time

Job Description

About the Role:
We are seeking an experienced MLOps Engineer to lead the deployment, scaling, and performance optimization of open-source Generative AI models on cloud infrastructure. You’ll work at the intersection of machine learning, DevOps, and cloud engineering to help productize and operationalize large-scale LLM and diffusion models.

Key Responsibilities:
- Design and implement scalable deployment pipelines for open-source Gen AI models (LLMs, diffusion models, etc.).
- Fine-tune and optimize models using techniques such as LoRA, quantization, and distillation.
- Manage inference workloads, latency optimization, and GPU utilization.
- Build CI/CD pipelines for model training, validation, and deployment.
- Integrate observability, logging, and alerting for model and infrastructure monitoring.
- Automate resource provisioning using Terraform, Helm, or similar tools on GCP/AWS/Azure.
- Ensure model versioning, reproducibility, and rollback using tools like MLflow, DVC, or Weights & Biases.
- Collaborate with data scientists, backend engineers, and DevOps teams to ensure smooth production rollouts.

Required Skills & Qualifications:
- 5+ years of total experience in software engineering or cloud infrastructure.
- 3+ years in MLOps, with direct experience deploying large Gen AI models.
- Hands-on experience with open-source models (e.g., LLaMA, Mistral, Stable Diffusion, Falcon).
- Strong knowledge of Docker, Kubernetes, and cloud compute orchestration.
- Proficiency in Python and familiarity with model-serving frameworks (e.g., FastAPI, Triton Inference Server, Hugging Face Accelerate, vLLM).
- Experience with cloud platforms (GCP preferred; AWS or Azure acceptable).
- Familiarity with distributed training, checkpointing, and model parallelism.

Good to Have:
- Experience with low-latency inference systems and token streaming architectures.
- Familiarity with cost optimization and scaling strategies for GPU-based workloads.
- Exposure to LLMOps tools (LangChain, BentoML, Ray Serve, etc.).

Why Join Us:
- Opportunity to work on cutting-edge Gen AI applications across industries.
- Collaborative team with deep expertise in AI, cloud, and enterprise software.
- Flexible work environment with a focus on innovation and impact.