Job Description

Responsibilities

Core AI/ML Fundamentals
Solid understanding of AI/ML concepts including:
- Classification, regression, neural networks
- OCR and transcription systems
- Audio/Video processing and multimodal learning

OCR, Transcription & Audio/Video Intelligence
Implement specialized models for:
- High-accuracy document OCR
- Real-time audio transcription
Architect deep learning pipelines for audio/video analysis and generation.
Integrate multimodal models (e.g., LLaVA, Whisper) into broader GenAI systems.

Generative AI & LLM Expertise
Strong understanding of:
- Generative AI techniques
- Transformer architectures
- RAG (Retrieval-Augmented Generation) pipelines
- Modern LLM ecosystems
Hands-on experience with:
- LLM parameter handling and model selection
- Scaling strategies and performance optimization
Expertise in:
- Prompt engineering and instruction tuning
- Prompt tuning and optimization for high-quality outputs
Familiarity with evaluation frameworks covering:
- Quality, grounding, accuracy, safety
- Latency and cost analysis
- Governance and compliance requirements

Agentic AI Systems
Experience designing and building agentic AI systems including:
- Multi-agent orchestration
- Tool-use workflows
- Autonomous task execution
Design long-term memory architectures:
- Vector-based memory systems
- Graph-based memory for complex, persistent context

Model Training, Fine-Tuning & Optimization
Knowledge of fine-tuning approaches:
- LoRA, QLoRA, supervised fine-tuning
Experience with model compression techniques:
- Quantization, distillation
Familiarity with performance-level tooling:
- CUDA, Triton, or specialized custom kernels
Design AI systems capable of efficiently handling and routing multiple user requests simultaneously, including:
- Scalable request handling
- Load-balanced inference
- Multi-tenant model utilization
- Caching and prioritization strategies

Infrastructure, Pipelines & Deployment
Oversee:
- Data pipeline integration
- Training workflows
- ML CI/CD processes
Strong understanding of:
- GPU/compute requirements
- Cost-efficient deployment strategies
Experience designing and managing production-grade inference servers using:
- vLLM
- Text Generation Inference (TGI)
- SGLang
Ability to collaborate with engineering teams to integrate LLMs into production systems:
- APIs, microservices, cloud architectures

Research, Evaluation & Continuous Innovation
Stay current with advancements in AI, ML, and LLM ecosystems.
Evaluate new tools, frameworks, and platform technologies to continuously enhance system architecture.

Job Title

Company : Impetus

Location : Noida, Uttar Pradesh

Created : 2026-02-23

Job Type : Full Time