Job Title


AIOps Engineer


Company : Imaging IQ


Location : Davanagere, Karnataka


Created : 2025-12-19


Job Type : Full Time


Job Description

DevOps/AIOps Engineer (Platform)

Experience: 3–5 Years

About the Company

We aim to bring about a new paradigm in medical image diagnostics: intelligent, holistic, ethical, explainable, and patient‑centric. We’re looking for innovative problem‑solvers who empathize with clinicians and patients, understand business problems, and can design and deliver reliable, intelligent products.

Key Responsibilities

· CI/CD for services & models: Own pipelines (GitHub Actions/GitLab CI), environment gates, artifact/version governance (containers, models, SBOMs), safe rollouts & instant rollbacks.
· Kubernetes platform (EKS preferred): Operate multi-env clusters; Helm/Kustomize; GitOps (Argo CD/Flux); progressive delivery (canary/blue-green/Argo Rollouts/Flagger).
· Serving & APIs: Deploy and tune FastAPI services and Triton/ONNX/TensorRT inference; traffic shaping, runtime config, autoscaling signals.
· Event-driven orchestration: Build robust consumers/producers on RabbitMQ/ActiveMQ/Kafka with back-pressure, dead-lettering, idempotency, and retry patterns.
· Observability & AIOps: Define SLIs/SLOs and error budgets; metrics/logs/traces (Prometheus/Grafana/Loki/Tempo/ELK); intelligent alerting & noise reduction; basic model/data drift hooks.
· Security in SDLC: Supply-chain security (image signing/provenance, SBOM scans), SAST/DAST/IaC scanning, policy-as-code (OPA/Gatekeeper), secrets hygiene in pipelines/workloads.
· Data/Model platform integration: S3/MinIO for artifacts; integrate model registry (MLflow or similar) into CD; immutable, traceable releases.
· Resilience & performance: Capacity planning (incl. GPU), autoscaling (HPA/VPA/KEDA), caching/queue tuning; chaos/game-days; write runbooks and own incident response for platform services.
· Developer experience: Golden paths, starter repos, internal Helm charts, docs & enablement to make shipping boring and fast.
· FinOps mindset: Cost dashboards, right-sizing, bin-packing, GPU utilization policies, spot vs on-demand strategy.

Skills and Qualifications (Required)

· 3+ years in DevOps/SRE/MLOps with strong Docker & Kubernetes fundamentals.
· Production CI/CD expertise; canary/blue-green; artifact & version management.
· IaC (Terraform) and GitOps workflows (Argo CD/Flux).
· Observability: Prometheus/Grafana; logs/traces with Loki/Tempo/ELK.
· Production message queues (RabbitMQ/ActiveMQ/Kafka) with back-pressure & retries.
· Cloud experience (AWS/GCP/Azure), EKS preferred; object storage (S3/MinIO); model registries (MLflow or similar).
· Security in SDLC and compliance guardrails for PHI-like data (least-privilege IAM, secrets, auditability).
· Incident response experience; writing SLIs/SLOs, runbooks, and operating to error budgets.
· Scripting for platform tasks (Python/Bash).

Preferred

· Triton Inference Server, ONNX/TensorRT optimizations; GPU scheduling on K8s (NVIDIA device plugin, MIG, node pools).
· Argo Rollouts/Flagger, Karpenter, KEDA; caching layers (Redis/NVCache patterns).
· Policy-as-code (OPA/Gatekeeper), image signing (cosign), SBOM tools (syft/grype).
· Network savvy for app delivery (ingress, service meshes, egress policies).

Education

BE/B.Tech (MS/M.Tech a bonus) or equivalent experience.

Location & Work Setup

On-site - Gurugram