
Job Title : MLOps Engineer


Company : Koda Staff


Location : Toronto, Ontario


Created : 2025-05-06


Job Type : Full Time


Job Description

My client is on a mission to make AI smarter, faster, and everywhere. If you live and breathe machine learning pipelines, cloud platforms, and deployment automation, they want you to join their team of tech rebels. They're not looking for mere mortals; they want an MLOps guru ready to optimize, automate, and scale machine learning to the moon.

What you'll be doing:

- Design, develop, and optimize ML pipelines like a wizard, ensuring training, validation, and inference run smoothly at scale.
- Automate deployment of deep learning and generative AI models for real-time applications (because who wants to deploy manually?).
- Implement model versioning and rollbacks, and ensure seamless updates like a smooth operator.
- Deploy and manage ML models on cloud platforms (AWS, GCP, Azure) using your containerized magic (hello, Docker and Kubernetes).
- Optimize real-time inference performance: bring TensorRT, ONNX, and PyTorch to their full glory.
- Work with GPU acceleration, distributed computing, and parallel processing to make AI workloads faster than a rocket.
- Fine-tune models to slice latency and boost scalability, because who likes slow models?
- Build and maintain CI/CD pipelines for ML models (GitHub Actions, Jenkins, ArgoCD) to make life easier.
- Automate retraining and deployment to ensure the AI is always learning (and never slacking).
- Develop monitoring solutions to track model drift, data integrity, and performance, because we don't believe in letting things slide.
- Stay on top of security, data privacy, and AI ethics standards, because we care about doing things right.

What we need from you:

- 5+ years of experience in MLOps, DevOps, or AI model deployment (you've been around the block).
- Mastery of Python and ML frameworks like TensorFlow, PyTorch, and ONNX (you know these like the back of your hand).
- You've deployed models using Docker, Kubernetes, and serverless architectures; it's second nature to you.
- Hands-on experience with ML pipeline tools (Argo Workflows, Kubeflow, MLflow, Airflow); you've built some mean pipelines.
- Expertise in cloud platforms (AWS, GCP, Azure) and hosting AI/ML models like a pro.
- GPU-based inference acceleration experience (CUDA, TensorRT, NVIDIA DeepStream); you make inference fast!
- Solid background in CI/CD workflows, automated testing, and deploying ML models without a hitch.
- Real-time inference optimization? Check. Scalable ML infrastructure? Double check.
- Excellent technical judgment: you can architect a system that works today and evolves tomorrow.
- You're all about automation; if you can script it, you will.
- You've got a deep understanding of distributed systems and computing architectures.
- Self-driven, with the ability to work independently and own it.
- Experience with Kubernetes, Docker, or microservices in general; this is your bread and butter.
- BS or MS in Computer Science or equivalent, because you've got the education (or experience) to back it up.

Nice to have:

- Some CUDA programming skills (you've probably dabbled).
- Experience with LLMs and generative AI models in production.
- A bit of networking knowledge never hurt anyone.
- Familiarity with distributed computing frameworks (Ray, Horovod, Spark); you like to go big.
- Edge AI deployment experience (Triton Inference Server, TFLite, CoreML), because why not push the envelope?

Why Them?

- Work with top-tier AI experts in a fast-growing startup.
- Flexibility: work from anywhere, anytime, as long as you get stuff done.
- Competitive salary plus benefits (obviously).
- Learning culture: they provide opportunities to grow and expand your skills.
- A work hard, play hard culture, because who says you can't have fun while crushing it?