About the job

Attention please!
• Only candidates with a short notice period (less than 30 days) are accepted. Please take note of this to save your effort. Thank you for your understanding.

Responsibilities:
- Automate deployment and management processes for machine learning platforms using tools such as Ansible and Python.
- Deploy, monitor, and patch ML platform components, including Cloudera Data Science Workbench (CDSW), Docker containers, and Kubernetes clusters.
- Ensure high availability and reliability of ML infrastructure through proactive maintenance and regular updates.
- Develop and maintain comprehensive documentation for platform configurations, processes, and procedures.
- Troubleshoot and resolve platform issues, ensuring minimal downtime and optimal performance.
- Implement best practices for security, scalability, and automation within the ML platform ecosystem.

Mandatory Skills Description:
We are seeking a skilled ML Platform Engineer to automate, deploy, patch, and maintain our machine learning platform infrastructure. You need hands-on experience with Cloudera Data Science Workbench (CDSW), Cloudera Data Platform (CDP), Docker, Kubernetes, Python, Ansible, GitLab, and MLOps best practices.
- Hands-on experience with CDSW (Cloudera Data Science Workbench) or a similar data science or ML/AI platform.
- Proficiency in containerization and orchestration using Docker and Kubernetes (AKS preferred).
- Solid scripting and automation skills in Python and Ansible.
- Experience with GitLab for source control and CI/CD automation.
- Understanding of MLOps principles and best practices (deployment, monitoring, lifecycle management of ML workloads).
- Familiarity with patching, updating, and maintaining platform infrastructure.
- In-depth Unix knowledge.
- Excellent problem-solving skills and a collaborative approach to team projects.

Nice-to-Have Skills Description:
- Previous banking domain experience.
- Familiarity with the Cloudera CDP ecosystem (beyond CDSW).
- Knowledge of monitoring and observability tools (Prometheus, Grafana, ELK).
- Exposure to Airflow, MLflow, or Kubeflow for workflow and ML lifecycle orchestration.
- Cloud platform experience with Azure (AKS, networking, storage, monitoring).
Job Title
ML Platform Engineer (only short notice periods of less than 30 days accepted)