Job Title: MLOps Engineer / ML Engineer
Experience: 5-10 years
Location: Chennai

Job Overview:
We are looking for an experienced MLOps Engineer to help deploy, scale, and manage machine learning models in production environments. You will work closely with data scientists and engineering teams to automate the machine learning lifecycle, optimize model performance, and ensure smooth integration with data pipelines.

Key Responsibilities:

Transform prototypes into production-grade models
- Assist in building and maintaining machine learning pipelines and infrastructure across cloud platforms such as AWS, Azure, and GCP.
- Develop REST APIs (for example, FastAPI services) for model serving, enabling real-time predictions and integration with other applications.
- Collaborate with data scientists to design and develop drift detection and accuracy measurements for deployed live models.
- Collaborate with data governance and technical teams to ensure compliance with engineering standards.

Maintain models in production
- Collaborate with data scientists and engineers to deploy, monitor, update, and manage models in production.
- Manage the full CI/CD cycle for live models, including testing and deployment.
- Develop logging, alerting, and mitigation strategies for handling model errors and optimizing performance.
- Troubleshoot and resolve issues related to ML model deployment and performance.
- Support both batch and real-time integrations for model inference, ensuring models are accessible through APIs or scheduled batch jobs, depending on the use case.

Contribute to AI platform and engineering practices
- Contribute to the development and maintenance of the AI infrastructure, ensuring the models are scalable, secure, and optimized for performance.
- Collaborate with the team to establish best practices for model deployment, version control, monitoring, and continuous integration/continuous deployment (CI/CD).
- Drive the adoption of modern AI/ML engineering practices and help enhance the team’s MLOps capabilities.
- Develop and maintain Flask- or FastAPI-based microservices for serving models and managing model APIs.

Minimum Required Skills:
- Bachelor's degree in computer science, analytics, mathematics, or statistics.
- Strong experience in Python, SQL, and PySpark.
- Solid understanding of containerization technologies (Docker, Podman, Kubernetes).
- Proficiency in CI/CD pipelines, model monitoring, and MLOps platforms (e.g., AWS SageMaker, Azure ML, MLflow).
- Proficiency in cloud platforms, specifically AWS, Azure, and GCP.
- Familiarity with ML frameworks such as TensorFlow, PyTorch, and scikit-learn.
- Familiarity with batch processing integration for large-scale data pipelines.
- Experience with serving models using FastAPI, Flask, or similar frameworks for real-time inference.
- Certifications in AWS, Azure, or ML technologies are a plus.
- Experience with Databricks is highly valued.
- Strong problem-solving and analytical skills.
- Ability to work in a team-oriented, collaborative environment.

Tools and Technologies:
- Model Development & Tracking: TensorFlow, PyTorch, scikit-learn, MLflow, Weights & Biases
- Model Packaging & Serving: Docker, Kubernetes, FastAPI, Flask, ONNX, TorchScript
- CI/CD & Pipelines: GitHub Actions, GitLab CI, Jenkins, ZenML, Kubeflow Pipelines, Metaflow
- Infrastructure & Orchestration: Terraform, Ansible, Apache Airflow, Prefect
- Cloud & Deployment: AWS, GCP, Azure, Serverless (Lambda, Cloud Functions)
- Monitoring & Logging: Prometheus, Grafana, ELK Stack, WhyLabs, Evidently AI, Arize
- Testing & Validation: Pytest, unittest, Pydantic, Great Expectations
- Feature Store & Data Handling: Feast, Tecton, Hopsworks, Pandas, Spark, Dask
- Message Brokers & Data Streams: Kafka, Redis Streams
- Vector DB & LLM Integrations (optional): Pinecone, FAISS, Weaviate, LangChain, LlamaIndex, PromptLayer

Job Title
Machine Learning Engineer