About SuprSend:
SuprSend is redefining notification infrastructure for businesses, enabling seamless communication at scale. Our platform ensures reliability, scalability, and efficiency in delivering notifications for the world’s most demanding applications. We’re looking for talented engineers passionate about building robust, high-performing systems to join us on our journey.

Role Summary:
We are seeking a DevOps Engineer/Site Reliability Engineer (SRE) to join our growing team. The ideal candidate will have extensive experience managing high-scale, distributed systems and expertise in modern DevOps practices. You’ll ensure that our systems are robust, efficient, and maintainable while supporting millions of notifications across our platform.

Key Responsibilities:

Infrastructure & Orchestration
• Build and maintain highly available systems using Kubernetes and Helm.
• Design and implement scalable solutions for real-time event streaming with Kafka or Pulsar.

Data & Storage Systems
• Optimize and manage data pipelines and storage systems such as ClickHouse, PostgreSQL, and Cassandra.
• Implement high-performance data architectures to support analytics and transactional systems.

Cloud & Automation
• Architect, deploy, and manage cloud-based infrastructure on AWS or GCP.
• Automate infrastructure provisioning, scaling, and monitoring using GitOps and Infrastructure-as-Code tools.
• Previous exposure to building or maintaining BYOC implementations for SaaS solutions would be a big plus.

CI/CD & Reliability
• Enhance and maintain CI/CD pipelines for reliable, automated deployments.
• Implement observability tools and practices to monitor system performance and detect issues proactively.

Collaboration & Support
• Partner with developers to integrate SRE best practices into the development lifecycle.
• Lead incident management, root cause analysis, and preventive measures for system reliability.

Required Skills & Experience:
• Core Proficiencies:
  • Expert-level understanding of Kubernetes and Helm for managing containerized applications.
  • Strong experience with Kafka or Pulsar for real-time data processing.
  • In-depth knowledge of databases such as ClickHouse, PostgreSQL, and Cassandra.
  • Proven expertise with AWS and/or GCP, including networking, storage, and compute services.
• DevOps Practices:
  • Extensive experience with CI/CD tools (e.g., Jenkins, GitLab CI/CD, ArgoCD).
  • Proficiency in GitOps workflows and Infrastructure-as-Code tools (Terraform, Pulumi, etc.).
• Experience:
  • 4+ years of experience in DevOps or SRE roles, ideally in high-scale environments.
  • Prior experience working on distributed, fault-tolerant systems in high-traffic companies.
• Soft Skills:
  • Strong problem-solving and analytical abilities.
  • Excellent communication and teamwork skills.

Preferred Qualifications:
• Familiarity with observability tools such as Prometheus, Grafana, or New Relic.
• Experience with disaster recovery planning and implementing RTO/RPO best practices.
• Understanding of security best practices in cloud-native environments.

What We Offer:
• Work on challenging, high-impact projects with a passionate and skilled team.
• Competitive salary and ESOP.
• Professional growth opportunities through training, conferences, and certifications.
• Opportunity to shape the future of notification infrastructure for global businesses.

Join us to solve engineering challenges at scale!
Job Title
DevOps Engineer/SRE