Role - MLOps Engineer
Location - Remote
Experience - 5+ Years

Responsibilities:
● Build and optimize model serving infrastructure with a focus on inference latency and cost
● Architect efficient inference pipelines that balance latency, throughput, and cost across various hardware acceleration options
● Develop monitoring and observability solutions for ML systems
● Collaborate with ML Engineers to establish best practices for optimized model deployment
● Implement cost-efficient, enterprise-scale solutions
● Collaborate in a cross-functional, distributed team on continuous system improvement
● Work closely with ML Engineers, QA Engineers, and DevOps Engineers
● Evaluate and adopt new technologies and tools
● Contribute to architectural decisions for distributed ML systems

Experience and Qualifications:
● 5+ years of software engineering experience with Python
● Experience with ML frameworks, particularly PyTorch
● Experience optimizing ML models with hardware acceleration (AWS Neuron, ONNX, TensorRT)
● Experience with AWS ML services and hardware-accelerated instances (SageMaker, Inferentia, Trainium)
● Proven experience building and operating AWS serverless architectures
● Deep understanding of event-driven processing patterns (SQS/SNS) and serverless caching solutions
● Experience with Docker containerization and orchestration tools
● Strong knowledge of RESTful API design and implementation
● Proficiency in writing high-quality, secure code and familiarity with static code analysis tools
● Excellent analytical, conceptual, and communication skills in spoken and written English
● Experience applying computer science fundamentals to algorithm design, problem solving, and complexity analysis
Job Title: Machine Learning Engineer