How to set up incident reporting (1/2)
Accidents hardly ever happen without a warning. Everyone has at some point experienced a near miss: you almost fell off your bike when you hit the curb – but luckily you didn’t. You sigh in relief, learn from it and move on. The same logic applies in a company setting: incident reporting lets your company learn from near misses and perhaps prevent future accidents.
In this article I will do my best to help you prepare for unexpected events. I will assume that your goal is to reduce their impact and prevent them from happening again.
In this first part I will give you an introduction to incident reporting and in the second part to incident management. You will also learn how to capture valuable lessons from your colleagues.
Along the way we will draw inspiration from how incident reporting is treated in the standards covering this area: Health, Safety and Environment (ISO 45001 standard) and Information security (ISO 27002 standard).
- Learn more about ISO 27002 standard
- Learn more about ISO 45001 standard
Lastly, I will give you a step-by-step guide on how to create an incident reporting process using the process management platform Gluu.
What are incidents, non-conformities or deviations actually?
For the sake of simplicity we will take a high level approach to the subject. There are lots of definitions within the field and this article will take them into account and use them to create a usable process.
The process literature and ISO standards have the following definitions we can use:
“The non-fulfillment of a requirement”
“Departure from an approved instruction or established standard”
“Operational event which is not part of standard operation”
“deficiency in a characteristic that renders the quality of a product unacceptable, indeterminate, or not according to specified requirements”
The definitions above seem to share two characteristics (yes – this is where we simplify a bit):
- An unplanned event happens,
- A planned event doesn’t happen.
The “non-conformance”, in this case, is product oriented and describes the effect of the adverse event rather than the cause. Looking beyond the actual product, the effects of an undesirable event range from near misses, human harm and accidents to compliance issues and lost opportunities.
Moreover, in healthcare, incidents (usually called “adverse events”) are costly, harmful and, to some extent, preventable.
“Estimates show that in high-income countries, as many as one in 10 patients is harmed while receiving hospital care. The harm can be caused by a range of adverse events, with nearly 50% of them considered preventable.” (WHO)
Everyone can agree that accidents should be prevented. If nobody wants accidents, then why doesn’t correct incident handling happen every time?
Sadly – there are several factors working against efficient incident reporting.
Below is a deep dive into three of the main reasons that incident reporting is more difficult than it should be.
Why not help your colleagues?
Who wants to be the idiot who admits there was an incident?
In a blame culture no one wants to admit they made a mistake.
If the incident didn’t directly affect anyone – why report it and stand out in a negative light?
Remember: When you set up an incident reporting system, the company must foster a culture of openness and trust. Incidents must be seen as opportunities for organisational learning, rather than individual failures.
IT can help by ensuring anonymity and removing potential penalties for the reporter. Yet this contradicts the goal of creating a culture of openness where everyone works towards a common goal of systematic improvement.
Management must spearhead this cultural change and embrace the potential that incidents hold for innovation and improvement. For more on this please read the Harvard Business Review on “The failure-tolerant leader”.
Failures are – after all – better than repeated failures.
It’s too difficult to report an incident
If the process of reporting an incident is too much of a hassle, there is a good chance that it will not be reported at all. Everyone doing incident analysis wants more data, but remember that front-line employees are busy bees. If the incident is not reported shortly after the event, employees will move on to the next task on their to-do list.
“Bureaucracy is the art of making the possible impossible” (Someone disillusioned)
Striking a balance between collecting enough data to understand the incident and keeping the process lightweight enough that it actually gets completed is essential to getting the data needed.
Ease of use for everyone is crucial to getting any data at all.
No response or effect of the incident report
As with any effort, it is important to see that your input matters. If employees sense that management will just ignore the reported incidents, chances are that future incidents won’t be reported.
Saying ‘thank you’ for the report and notifying the reporter when it is being handled (or even when a fix has been implemented) is a simple, effective and inexpensive way to show appreciation. Gratitude does not have to be monetary, especially when incident reports help the entire company.
Again – without incident reporting from employees there is no data to prevent future accidents. So keep these cultural factors in mind.
Now let’s look at a possible solution!
A how-to guide for incident reporting in Gluu
The common sense approach
Let us put the ISO standards and the fancy words aside for a moment. What is this really about?
Most people have a good gut feeling when something is off. When that wire hanging out from the wall is making strange noises, most people will notice.
Luckily most people also have the ability to help out using only common sense. Turn off the switch or find someone who knows how to.
After the “Phew – I’m glad no one died” moment we need to clean up any mess and make sure that it will not happen again.
This whole article is about learning from other people’s experiences. Looking at the ISO 45001 standard and ISO 27002 standard is exactly that, as they are a formal, experience-based approach to common sense.
Two main types of incidents
To get to the process level we need to dive into two very different types of incidents to see what we can learn from them.
These are two incident types that can occur in any organisation. Each is covered by its own ISO standard:
- Health, Safety and Environment (HSE) incidents
(ISO 45001 standard)
An event that does not cause harm, but has the potential to cause injury, loss of property or material, or accidents under similar conditions. For example, not wearing a helmet on a construction site. In itself it doesn’t matter, but in a hazardous environment protection is key to preventing accidents.
- IT cyber security incidents
(ISO 27002 standard)
Unlike an actual data breach, a cyber security incident doesn’t necessarily mean information is compromised; it only means that information is threatened. For example, an organisation that successfully repels a cyber attack has experienced an incident, but not a breach.
Based on knowledge from the ISO 27002 and ISO 45001 standards, we will create an incident reporting process in Gluu.
First step: Prevent the incident from becoming an accident
The first activity should always be to stop (if possible) anything bad going on. Let me give you some examples.
What to consider for HSE incidents?
Health and safety incidents usually require physical intervention:
- Electrical plug halfway out?
– put it back in.
- Machine out of control?
– turn it off.
- Soap on the ship deck?
– clean it up.
Please remember that you must keep yourself safe in the process!
Secure the scene by barricading the area if possible and prevent any further entry, so that your colleagues are kept out of harm’s way.
What to consider for IT incidents?
Cyber-crime is harder to discover. There are rarely masked people raiding the server room. Intervention comes in many forms – if it is possible at all.
You may be able to prevent phishing emails from being forwarded (or send an alert email to everyone), and if you suspect that someone has the root password, it might be a good time to change it.
How to set up a report incident process in Gluu
Based on the article “How to do simple process mapping” I created a process named “Report incident”. The intended outcome of the process is simply: “Incident has been successfully reported”.
As mentioned before, incident reporting is everyone’s responsibility. Assigning the process to the role “Regular employee” gives all employees access to it (and the obligation to use it).
Even with the best intentions and strong common sense, no one knows how to handle everything. Specialised situations sometimes require specialised knowledge.
Once the “regular employee” has prevented what can be prevented, the responsibility should be passed to someone within the field. A line manager can turn the bulldozer off without causing more harm.
As with anything in life – no one is responsible outside their ability.
Splitting the action between two roles hopefully ensures that further escalation is prevented.
- “Knowledge gained from analysing and resolving information security incidents should be used to reduce the likelihood or impact of future incidents.”
- “Identification, collection, acquisition and preservation of information, which can serve as evidence.”
The second of these ISO 27002 priorities is meant for a potential lawsuit, which we will not look into in this article.
Common to both standards is the need for proper reporting to prevent recurrence: all available information must be secured for further analysis, if possible under the circumstances. This can conflict somewhat with escalation prevention, as the prevention might cover up or destroy valuable documentation. But overall it seems like a reasonable tradeoff – especially if your colleague is on fire.
It is impossible to make an exhaustive list of what to secure. Consider screen dumps, audio, photos and log files – everything counts.
The high-level work instruction is very straightforward: “Secure all the information you can”. Over time the organisation learns from both causes and (adverse) effects and can improve the work instruction based on past experience alone.
The incident reporting process needs occasional revisions to stay in touch with reality.
It is generally not a good idea for the same role to have two activities right after one another. For the sake of simplicity they should normally be combined to create a better overview.
In this case, however, the activities are thematically very different and so are their work instructions. So two separate activities make sense.
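The rule of thumb about neighbouring activities can be sketched in a few lines of Python. This is only an illustration and not Gluu's data model or API: the `Activity` class, the `consecutive_same_role` helper and the names of the first two activities are assumptions made for this example. Only the two roles and the “Secure information” activity are taken from the process described here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    role: str


# Hypothetical model of the "Report incident" process.
# The first two activity names are invented for illustration.
REPORT_INCIDENT = [
    Activity("Stop the incident", "Regular employee"),
    Activity("Prevent further escalation", "Line manager"),
    Activity("Secure information", "Line manager"),
]


def consecutive_same_role(activities):
    """Return name pairs of neighbouring activities owned by the same role."""
    return [
        (first.name, second.name)
        for first, second in zip(activities, activities[1:])
        if first.role == second.role
    ]


if __name__ == "__main__":
    for first, second in consecutive_same_role(REPORT_INCIDENT):
        print(f"Review: '{first}' and '{second}' share a role")
```

A check like this only flags candidates for merging. Whether two flagged activities should really be combined remains a judgment call, as with the two line manager activities here.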
The activity “Secure information” should include the following tasks for the Line manager in order to provide enough information for further analysis:
- Pictures / screendumps of the incident (visual)
- Description of what happened (text)
The “Secure information” tasks in Gluu
The last task in the activity, “Fill out the incident report”, is a Gluu asset built in the Form builder that collects input in a predefined and structured way.
ISO 27002 (section 16.1.2) has certain requirements for how incidents are categorised and reported.
We can add some common sense categorisation to the HSE form knowing that more causes can be added over time.
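To make the idea of a structured form with categories more concrete, here is a minimal Python sketch of what such a report could hold. It is not the Gluu Form builder and not a field list prescribed by either standard: the class names, the fields and the starter categories (loosely based on the examples in this article) are all assumptions, and the category sets are meant to grow as new causes show up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class IncidentType(Enum):
    HSE = "Health, Safety and Environment (ISO 45001)"
    IT_SECURITY = "Information security (ISO 27002)"


# Hypothetical starter categories; extend them as new causes are seen.
CATEGORIES = {
    IncidentType.HSE: {
        "Missing protective equipment",
        "Slippery surface",
        "Electrical hazard",
        "Other",
    },
    IncidentType.IT_SECURITY: {"Phishing", "Compromised password", "Other"},
}


@dataclass
class IncidentReport:
    incident_type: IncidentType
    category: str
    description: str
    # File names of photos, screen dumps, log files and similar evidence.
    attachments: list[str] = field(default_factory=list)
    reported_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def problems(self) -> list[str]:
        """Return the reasons the report is incomplete (empty if none)."""
        issues = []
        if self.category not in CATEGORIES[self.incident_type]:
            issues.append(f"Unknown category: {self.category}")
        if not self.description.strip():
            issues.append("Description is missing")
        return issues


report = IncidentReport(
    IncidentType.HSE,
    "Slippery surface",
    "Soap spilled on the ship deck",
    ["deck.jpg"],
)
print(report.problems())  # prints [] because nothing is missing
```

Keeping the mandatory part this small reflects the earlier point about ease of use: a category and a short description are required, while attachments stay optional.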
ISO 27002 and ISO 45001 standard forms
ISO 27002 form in Gluu
ISO 45001 form in Gluu
To end the process we will add an endpoint to show that the observed incident has now been reported – which was our desired process outcome to begin with.
Going from reporting incidents to managing incidents
Looking at the broader perspective, there is more that we can do. Now that incident reporting is in place, it is time for incident management. To see how this is done, read my article “How to handle incident management”.