
Mastering Operational Resilience: Strategies for Scenario Testing


Until a firm experiences a live incident, operational resilience scenario testing is the only way for it to understand whether it can remain within Impact Tolerance and meet the objectives of the regulation. Over the past three years we have conducted or reviewed testing with almost 30 firms, and have spoken with many more in meetings and through our events and webinar programme.

We've distilled our experiences into a handful of lessons to support firms in their testing maturity.

First things first...what should a scenario test be trying to achieve?

According to FCA PS21/3:

"Our policy covers disruptions inside and outside of a firm’s control. To prepare for such disruptions, firms need to test their impact tolerances in a range of severe but plausible scenarios. This approach will give firms a clear idea when they initially test their impact tolerances of what such unexpected events may mean if they cannot remain within tolerance."

The last point is key for firms to remember as they undertake their testing activity: where would such unexpected events mean the firm cannot remain within tolerance?

So, the primary goal for a scenario test should be to identify where a firm is unable to remain inside Impact Tolerance following a severe but plausible scenario.

If a test isn't going to teach a firm anything new, its value is debatable. Fundamentally, a scenario test should teach a firm something about its operational resilience capability that it doesn't already know, and provide feedback on where to improve or invest.

That leads to the first lessons we've learned from client programmes.

Ineffective scenario development

We've seen scenarios that are too severe and implausible ("shock and awe scenarios"), such as an AWS outage that coincides with a COVID-24 outbreak and a fire in the server room. There is even an argument that, for most firms, testing a scenario such as an AWS outage is redundant, because the firm will be wholly reliant on recovery and will not have the resources or appetite to invest in a contingency.

At the other end of the scale, we've seen many firms develop scenarios that are too vague or not severe enough to truly test operational resilience capability.

In these instances, we've felt that the firm is only testing to evidence systems recovery capability, i.e. do documented Recovery Time Objectives enable the firm to recover inside Impact Tolerance?
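To show how narrow that check is, here is a minimal sketch of it in Python. The system names, RTO values and the 12-hour Impact Tolerance are hypothetical, as is the simplification that systems recover either fully in parallel or strictly one after another.

```python
def recovery_within_tolerance(rto_hours, impact_tolerance_hours, sequential=False):
    """Compare documented RTOs for the systems behind one service
    with that service's Impact Tolerance.

    rto_hours: dict mapping a system name to its documented RTO in hours.
    sequential: if True, assume systems are recovered one after another;
                otherwise assume they are recovered in parallel.
    Returns (estimated_recovery_hours, within_tolerance).
    """
    if not rto_hours:
        raise ValueError("at least one RTO is required")
    if sequential:
        estimated = sum(rto_hours.values())
    else:
        estimated = max(rto_hours.values())
    return estimated, estimated <= impact_tolerance_hours


# Hypothetical figures for illustration only.
rtos = {"payments_gateway": 4.0, "core_ledger": 8.0, "customer_portal": 2.0}

print(recovery_within_tolerance(rtos, impact_tolerance_hours=12.0))
print(recovery_within_tolerance(rtos, impact_tolerance_hours=12.0, sequential=True))
```

The same documented RTOs pass or fail depending on a single assumption about recovery order, and the comparison says nothing about detection, workarounds or customer harm. That is why a test limited to this check teaches a firm very little.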

Scenarios should be specific to the firm, evidencing known vulnerabilities, which have been identified through mapping to a sufficiently granular level and then conducting a robust threats and vulnerabilities analysis.

Firms should make the scenario specific and relevant, using live firm data to support the scenario and illustrate the likely impact to customers and stakeholders.

Finally, per FCA PS21/3, scenarios should be severe enough "...to gain assurance of the resilience of their important business services and identify where they might need to act to increase their operational resilience."

Again, scenarios should show a firm where and how they are vulnerable.

Testing end-to-end resilience capability

The second lesson we've identified is that firms should try to test the full spectrum of resilience capabilities as part of the scenario testing process. We would expect a firm to incorporate identify, prevent, respond, adapt, recover and learn capabilities in its scenario testing.

Scenario testing should help firms understand all of their operational resilience capabilities: how they assess the impact of an incident, how they can prevent disruptions in the first place, what workarounds and mitigants are available, and how they recover.

Firms should think about segmenting their testing so that they can ask simple questions of Important Business Service (IBS) owners and pillar owners:

  • Identify - What capabilities exist in the firm to identify a scenario like this?
  • Impact - How many customers are impacted? How are those customers likely to be impacted? Will intolerable harm be caused?
  • Prevent - What controls exist to prevent or reduce the impact of such a scenario? Why might they have failed?
  • Respond - What are the response priorities? How can you contain the incident and/or prevent contagion? Who plays which roles and responsibilities throughout the incident? Do those people have the right skills and knowledge?
  • Adapt - What workarounds are available?
  • Recover - What capabilities exist in the firm to recover? What would we need to do to bring that recovery inside Impact Tolerance (ITOL)?
  • Learn - What has the scenario taught us about our operational resilience capability? Do we need to make investments in resources, training, systems?

Don't stop at the workaround

For many firms, the scenario test stops at the identification of a mitigant or workaround. Whilst understanding existing workarounds and identifying new ones is a clear and valuable output of scenario testing, those workarounds must also be tested.

In many cases, we've seen firms spend significant effort creating new workarounds and processes without applying sufficient due diligence to test them. If these workarounds are the last line of defence before breaching Impact Tolerance, firms need to be sure that they stand up to the rigour of their own specific severe but plausible scenarios, or a combination of those scenarios.

The final point linked to manual workarounds is that firms need to consider backlogs. Moving to manual processes will inevitably create backlogs, and firms must use testing to establish the point at which customers caught in a manual backlog may experience intolerable harm. Firms should develop a simple model to ensure they are thinking about these backlogs.
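As an illustration of what such a simple model might look like, the Python sketch below steps through a disruption hour by hour. Every name and number is hypothetical. It assumes a constant arrival rate, a first-in-first-out queue, and a crude wait estimate (backlog divided by current capacity) as a proxy for the point at which customers may experience intolerable harm. A real model would use the firm's own volumes and definitions of harm.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class BacklogAssumptions:
    arrivals_per_hour: float           # new customer requests per hour (assumed constant)
    manual_capacity_per_hour: float    # requests the manual workaround clears per hour
    restored_capacity_per_hour: float  # capacity once normal service is recovered
    outage_hours: int                  # hours spent running on the workaround
    impact_tolerance_hours: float      # maximum tolerable wait for a customer


def simulate_backlog(
    a: BacklogAssumptions, horizon_hours: int = 240
) -> Tuple[float, Optional[int], Optional[int]]:
    """Return (peak_backlog, first_breach_hour, hour_backlog_cleared).

    Hours are counted from the start of the disruption. The last two
    values are None if the event does not occur within the horizon.
    """
    backlog = 0.0
    peak = 0.0
    first_breach = None
    cleared_at = None
    for hour in range(1, horizon_hours + 1):
        in_outage = hour <= a.outage_hours
        capacity = a.manual_capacity_per_hour if in_outage else a.restored_capacity_per_hour
        backlog = max(0.0, backlog + a.arrivals_per_hour - capacity)
        peak = max(peak, backlog)
        # Crude estimate of the wait for a request joining the back of the queue now.
        if capacity > 0:
            est_wait = backlog / capacity
        else:
            est_wait = float("inf") if backlog > 0 else 0.0
        if first_breach is None and est_wait > a.impact_tolerance_hours:
            first_breach = hour
        if not in_outage and backlog == 0.0:
            cleared_at = hour
            break
    return peak, first_breach, cleared_at


# Hypothetical scenario: an 8-hour outage handled by a manual workaround.
scenario = BacklogAssumptions(
    arrivals_per_hour=100,
    manual_capacity_per_hour=40,
    restored_capacity_per_hour=150,
    outage_hours=8,
    impact_tolerance_hours=4,
)
peak, first_breach, cleared_at = simulate_backlog(scenario)
print(peak, first_breach, cleared_at)
```

With these illustrative figures, the estimated wait exceeds the 4-hour tolerance in hour 3, well before systems are restored in hour 8, and the backlog is not cleared until hour 18. Even a model this basic makes the point that recovery of the system is not the same as recovery for the customer.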

Summary

Alongside a methodical and granular approach to mapping and risk analysis, a robust testing programme extracts the most value from a firm's operational resilience work by ensuring the firm understands where it is vulnerable and where resilience risks exist.

We hope that firms will adopt some of these straightforward lessons to further enhance resilience capabilities and meet the intended regulatory objectives at the earliest opportunity ahead of March 2025.

 
How FourthLine can help
If you're interested in exploring how FourthLine can support your scenario testing programme, feel free to reach out here or schedule a meeting with one of our experts today.
Read our Operational Resilience Insight Deck
February 29, 2024
Daniel Waltham
Responsible for leading client relationships and new business sales. Dan takes a lead role in customer engagement, identifying, creating and designing solutions to help our customers with risk and regulatory challenges. 13 years of experience working with financial services businesses across risk, compliance, data protection and regulatory change.
Copyright © 2022, FourthLine. All Rights Reserved.