This is an opportunity for a technically minded, problem-solving SRE to join a leading global fintech that is currently growing its presence in Europe.

Read on to find out what you will need to succeed in this position, including skills, qualifications, and experience.

Role Responsibilities:
- Investigate, troubleshoot and diagnose incidents.
- Provide first- to third-line investigation and diagnosis of incidents and service requests.
- Act as incident coordinator for operational incidents on the core engineering production platform. This includes all internal technical communications, ensuring processes are followed, and all post-incident follow-up and analysis.
- Escalate incidents or service requests that require system, config or code changes to the appropriate on-call engineer.
- Manage engineering service requests, prioritizing them according to urgency and impact and ensuring they are serviced in a timely manner.
- Work with engineers to establish or update the runbooks and procedures needed for handling incidents and service requests.
- Develop and maintain the knowledge base and respond to customers' technical questions.
- Actively monitor integration endpoints and external programmatic dependencies (e.g. venue APIs).
- Maintain scripts, dashboards and other programmatic tools, whether acquired or built in-house.

Qualifications & Required Skillset:
- Ability to diagnose and troubleshoot technical issues both offline and in real time.
- Ability to handle multiple priorities and deal with ambiguity.
- Experience with incident and problem management processes.
- Experience working in an Application Support, DevOps or SRE role (preferably within trading and risk management systems).
- Experience communicating with customers as well as with senior software engineers.
- Experience with Python, PostgreSQL and Unix.
- Experience writing intermediate to advanced SQL queries for data extraction and troubleshooting purposes.
- Experience using and troubleshooting programming interfaces, especially REST APIs and WebSockets.
- Experience with monitoring tools (Grafana, Datadog).
- Experience working with crypto and blockchain (DLT).
- Familiarity with common engineering development workflows and tools (e.g. JIRA, Confluence, GitHub, Scrum).
- Familiarity with the scaling, monitoring and general production challenges of real-time (banking) systems.
- Familiarity with financial services infrastructure and processes (e.g. ITIL) and related systems in an SRE or DevOps capacity.
- Familiarity with AWS cloud infrastructure and processes.
- Familiarity with release management processes and the SDLC using agile methodologies and best practices.
- Motivated by working with people and solving their problems.
- Understanding of basic programming constructs (loops, conditionals, data types, regular expressions), with the ability to read and write non-trivial production and operational scripts.
Job Title
Site Reliability Engineer