Stripe

Incident Response Manager

On site

Moreton-In-Marsh, United Kingdom

Full Time

17-03-2025

Job Specifications

Who we are

About Stripe

Stripe is a financial infrastructure platform for businesses. Millions of companies—from the world’s largest enterprises to the most ambitious startups—use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Our mission is to increase the GDP of the internet, and we have a staggering amount of work ahead. That means you have an unprecedented opportunity to put the global economy within everyone’s reach while doing the most important work of your career.

About The Team

The Incident Ops team is a global 24/7 team responsible for driving incident response and management from detection to resolution. Stripe is proud of its five 9s reliability and this team is at the forefront of ensuring we keep it that way - working hand-in-hand with Reliability Eng and across the Tech Org. This team of incident response managers (IRMs) is defined by our sense of ownership and how we drive incidents to resolution - marshaling the necessary cross-functional resources to respond to and resolve service outages, critical bugs, security attacks and anything else that significantly impacts the users of our products. The team is user-first and ensures appropriate external communications from Stripe and senior management to keep our users informed of disruption to their experience of Stripe. The team is skilled in communications and incident handling and is technically adept, as incidents can arise from anywhere and cut across products and orgs in Stripe.

What you’ll do

As an Incident Response Manager (IRM), you’ll play the key role in driving the right level of response from Stripes to incidents: determining impact, rallying Stripes to mitigate, communicating to users, ensuring appropriate remediations and orchestrating the Root Cause Analysis (RCA) process. You’ll work hand-in-hand with IRMs and engineers globally to ensure solid 24/7 coverage in how we monitor, detect, respond to, communicate about and mitigate incidents. When not managing incidents, you’ll help scale our ability to respond to incidents, improve our operations, analyze data to provide insights and deepen our technical expertise in products. As a result, you’ll be seen as the protector of our users - minimizing the impact of incidents on their business and ensuring that Stripe is always thinking of our users.

Responsibilities

Act as an on-call Incident Commander, responsible for driving and managing incident resolution with a high level of urgency, cross-functional collaboration, and accuracy, while partnering with a global and diverse set of teams, including Engineering, Product, Policy, Risks, PR, Legal, Execs, etc.
Lead all user-facing incidents across domains at Stripe - including reliability, technical, security, and data privacy
Take a "User First" approach to determining impact, providing accurate situation reports, facilitating comms bridges, and ensuring useful and timely external communications to users
Proactively update internal stakeholders, make decisions through data and influence by partnering with Engineering, Sales, Support and other cross-functional teams
Contribute to the root cause analysis process by conducting post-mortems and identifying remediations, and ensure problem management tasks meet SLA and user expectations
Drive improvements in the incident handling process and incident management metrics and tooling based on trends and data of Stripe's incidents in collaboration with engineering, product and operations teams
Collaborate closely with leadership for building team strategy based on the team vision
Collaborate and coach other Incident Response Managers on the team


Who you are

We’re looking for someone who meets the minimum requirements to be considered for the role. If you meet these requirements, you are encouraged to apply. The preferred qualifications are a bonus, not a requirement.

Minimum Requirements

5+ years of demonstrable major incident experience for organizations that run mission-critical applications or always-on SaaS environments.
Demonstrated ability to lead multiple incidents concurrently with authority and influence responders with agency and reasoning skills to resolve ambiguous problems and drive to root cause.
Strong full stack technical skills with development/support experience with cloud based technologies
Demonstrated experience developing code and automation using Python, Ruby, JavaScript or shell scripting.
Solid understanding of infrastructure, including physical, virtual, and container-based compute platforms
Strong quantitative and analytical skills in data manipulation using SQL, Splunk or other tools.
Excellent task management skills; must be detail-oriented, with the ability to remain composed, methodical, and think fast in a high-pressure environment.
Exceptional written and verbal English communication skills, with the ability to translate complex technical issues for internal and external stakeholders


Preferred Qualifications

Domain expertise in classes of incidents such as technical, privacy, security or crisis with a strong desire to continuously learn about Stripe's products, technical issues and systems.
Ability to review complex technical details regarding ongoing issues/events and convey the key details to senior stakeholders to facilitate real-time decision making.
Experience with broad user-facing communications (e.g. status pages, tweets) and/or targeted communications (e.g. direct emails, support ticket responses).
Familiarity with operating or managing distributed architectures, with the ability to correlate system behaviors based on known inter-dependencies.
Demonstrated experience with full stack development and support


Hybrid work at Stripe

This role is available either in an office or a remote location (typically, 35+ miles or 56+ km from a Stripe office).

Office-assigned Stripes spend at least 50% of the time in a given month in their local office or with users. This strikes a balance between bringing people together for in-person collaboration and learning from each other, while supporting flexibility about how to do this in a way that makes sense for individuals and their teams.

A remote location, in most cases, is defined as being 35 miles (56 kilometers) or more from one of our offices. While you would be welcome to come into the office for team/business meetings, on-sites, meet-ups, and events, our expectation is you would regularly work from home rather than a Stripe office. Stripe does not cover the cost of relocating to a remote location. We encourage you to apply for roles that match the location where you currently or plan to live.

Pay and benefits

The annual salary range for this role in the primary location is €77,600 - €116,400. This range may change if you are hired in another location. For sales roles, the range provided is the role’s On Target Earnings (“OTE”) range, meaning that the range includes both the sales commissions/sales bonuses target and annual base salary for the r...

About the Company

Stripe is a financial infrastructure platform for businesses. Millions of companies, from the world’s largest enterprises to the most ambitious startups, use Stripe to accept payments, grow their revenue, and accelerate new business opportunities. Headquartered in San Francisco and Dublin, the company aims to increase the GDP of the internet.
