Job Title


Site Reliability Engineer


Company : RoboMQ


Location : Jaipur, Rajasthan


Created : 2026-01-26


Job Type : Full Time


Job Description

Location: Jaipur (Rajasthan)
Position type: Full time

Before you apply, make sure you have:
- 3+ years’ experience working in a DevOps, Platform Engineer, or Site Reliability Engineer role.
- B.Tech degree with relevant technical experience.
- Demonstrated ability to provide on-call support for critical infrastructure issues.
- Ability to quickly learn new technologies and apply them to our rapidly evolving product and business.
- Exceptional verbal and written communication skills.
- Experience working on distributed systems.

Responsibilities
- Maintain and administer multiple multi-node Kubernetes clusters for high availability and optimum performance.
- Set up and manage logging, monitoring, and alerting using tools like Prometheus, Grafana, EFK, or CloudWatch.
- Design, implement, and manage CI/CD pipelines for seamless deployments.
- Keep the cloud infrastructure hosted on AWS secure and optimized.
- Automate infrastructure provisioning, scaling, and security compliance on AWS through Terraform.
- Strengthen cloud security through IAM policies, encryption, and vulnerability scans.
- Perform root cause analysis and system troubleshooting, and implement improvements.
- Work with penetration testing tools like Nmap to analyse and improve network security.
- Strengthen overall security, including infrastructure security, web app security, and IAM security.

Key Skills [Must have]
- Strong hands-on experience with Docker and Kubernetes.
- Strong understanding of Git and version control.
- CI/CD: Jenkins, GitHub, GitHub Actions.
- Infrastructure as Code (experience with Terraform).
- Experience deploying and managing cloud-based applications, preferably on AWS.
- Cloud networking and security fundamentals (IAM, firewalls, SSL, encryption).
- Excellent knowledge of shell scripting.
- Cyber security: OWASP Top 10, Nmap, ZAP.

Additional Skills [Good to have]
- Helm charts: kOps
- SonarQube
- Monitoring: Prometheus, Grafana, Alertmanager.
- Logging: Elasticsearch, Fluentd, Kibana.
- Networking: Istio, Kong.
- Hands-on experience with a programming language.
- Experience with message queues (Kafka, RabbitMQ, SQS).
- Familiarity with SRE (Site Reliability Engineering) practices.