Reporting to the Sr. Manager of Data Science, this role will establish and lead Xplore's enterprise machine learning platform, built on Azure Databricks and MLOps practices, to power trusted, scalable AI across the business. You will define the end-to-end ML lifecycle (data/feature pipelines, experimentation, model training, evaluation, registration, deployment, and monitoring), enable online/offline feature stores, and embed governance, explainability, and compliance by design.

Key responsibilities
- Own the ML platform architecture and delivery to support a broad range of AI/ML use cases (e.g., customer experience, marketing optimization, network/OSS advanced analytics, fraud/risk, forecasting), spanning batch and real-time inference.
- Stand up robust feature engineering and model training pipelines integrated with the enterprise lakehouse and streaming systems (e.g., Azure Databricks, Spark/Ray, Kafka/Kinesis, SQL Server/Oracle sources, Salesforce via APIs/CDC), ensuring reproducibility and lineage from data to model artifact.
- Implement MLOps at scale: experiment tracking, model registry, approval workflows, automated training/retraining, CI/CD for ML, and environment parity across dev/test/prod using infrastructure as code (e.g., Terraform) and Git-based workflows.
- Deploy and operate low-latency model serving for APIs and event-driven architectures (real-time inference, streaming features), including canary, shadow, and champion-challenger patterns and A/B testing frameworks.
- Establish ML governance and responsible AI controls: model documentation/cards, explainability, bias and fairness testing, human-in-the-loop review, versioning, and approvals aligned to Canadian regulations (e.g., PIPEDA/CPPA, Quebec Law 25) and internal risk/compliance standards.
- Build end-to-end ML observability: define and enforce model SLAs/SLOs (latency, availability); monitor data quality, concept drift, performance decay, and data/feature freshness; and set automated alerts and rollback/disable strategies.
- Stand up and curate an enterprise feature store supporting both offline analytics and online serving; manage feature contracts, ownership, and reuse to accelerate model delivery.
- Partner closely with data engineering to ensure quality, well-documented, fit-for-purpose datasets and MLOps-friendly patterns; collaborate with architecture, InfoSec, privacy, legal, and compliance to translate policy into technical controls and to support audits and evidence collection.
- Lead engineering best practices: test-first ML (unit/integration/data tests), performance tuning, cost governance for training and serving, model packaging and dependency hygiene, and secrets and key management.
- Create reference architectures, standards, and reusable frameworks (e.g., training/inference templates, evaluation harnesses, reproducible pipelines); mentor data scientists and champion security-first, privacy-by-design development.
- Engage business stakeholders to shape the AI roadmap, translate use cases into measurable ML products, and quantify the value delivered (e.g., lift, ROI, operational efficiency).

The ideal candidate
- 10+ years of applied ML or adjacent data/AI experience, including 4+ years leading platform or team-level initiatives and shipping models to production at scale.
- Expert, hands-on experience with ML platforms and distributed compute (e.g., Spark/Ray ecosystems; Kubernetes for training/serving; cloud-native ML services and storage; batch + streaming).
- Strong MLOps track record with experiment tracking, model registries, feature stores, and automated CI/CD pipelines (e.g., MLflow/Kubeflow/SageMaker/Vertex/Databricks, Argo/GitHub Actions/Azure DevOps).
- Proficiency in Python for ML and data engineering (packaging, testing, virtual environments), advanced SQL, and performance tuning for large-scale ETL/ELT and model training.
- Deep familiarity with modern ML frameworks (e.g., scikit-learn, XGBoost/LightGBM, PyTorch, TensorFlow), model evaluation/validation, and production-grade inference patterns (REST/gRPC, streaming).
- Practical experience integrating enterprise systems: ingesting from Salesforce, on-prem RDBMS, files/telemetry, OSS/BSS, and APIs; experience bridging on-prem systems with cloud platforms.
- Solid grasp of security and privacy controls for ML/data (RBAC/ABAC, encryption, tokenization/masking), secrets and key management, and familiarity with Canadian privacy regimes (PIPEDA/CPPA, Quebec Law 25) and other relevant standards.
- Experience with observability (metrics, logs, traces), ML-specific monitoring (data drift, concept drift, performance decay), and SLO-driven operations (e.g., Prometheus/Grafana/CloudWatch, data observability tools).
- Infrastructure-as-code and testing expertise: Terraform, containerization (Docker), IaC for ML infrastructure, automated testing frameworks (unit, integration, data/feature tests), and reproducible environments.
- Comfort collaborating across domains (Network, Care, Marketing, Finance, Sales) and translating business needs into scalable, compliant ML products with clear value metrics.
- Advanced academic background in Computer Science, Engineering, Mathematics, or a related quantitative field (graduate degree preferred).
- Excellent communication skills, a coaching mindset, and the ability to set a platform vision and deliver iteratively.
Job Title: Principal Data Scientist