Building Patient Safety Skills: Common Pitfalls When Conducting a Root Cause Analysis

April 22, 2010

Most hospitals are acquainted with the root cause analysis (RCA) process and have conducted numerous RCAs in the 15 years since The Joint Commission first required its use to investigate sentinel events. RCA is the most basic type of event investigation: an analytical approach to problem solving that seeks to identify why adverse events happen and how to prevent them.

Through our consultation services, ISMP has had an opportunity to review many RCAs associated with medication-related events. While we have seen a steady rise in the use of this tool, we continue to observe common pitfalls encountered while conducting a RCA, often rendering the process less useful than intended.

These pitfalls are not surprising given the lack of well-designed patient safety and quality improvement curricula available to healthcare professionals during their training and post-graduation. Many healthcare professionals learn the science and skills associated with quality improvement and patient safety—including RCA—through informal on-the-job training (although workshops on these topics have been available periodically). Most would agree that not enough has been done to prepare healthcare professionals to anticipate, identify, analyze, and resolve patient safety problems.

ISMP plans to present a series of articles on event investigation, data analysis, and prospective risk-assessment that will be published periodically during 2010. Skills in these areas are pivotal to patient safety and quality improvement. We begin this week with a discussion about pitfalls ISMP commonly observes with RCA.

Skipping the chronology

Many RCAs do not include a sequence of events, flow chart, and/or narrative that adequately describes what actually happened. To be effective, a RCA must start with an accurate sequence of events and timeline to help uncover all the gaps where human error occurred or unsafe behavioral choices were made. This helps define the problems that need to be addressed, clarify the relationship between contributory factors and the underlying causes, and ensure that all aspects of the event are analyzed. Although developing an event chronology is time-consuming, it is a step that should not be skipped despite time constraints and a desire to quickly “get to the bottom” of the event.

Reliance on policies and procedures

Some RCAs fail to uncover “real life” conditions that led to an event because the team relies too much on what is written in policies and procedures to illustrate what normally happens when care is provided. Table 1 lists basic questions that should be answered during a RCA. Question #2 (“What normally happens?”) is often skipped, and the team moves on to Question #3 (“What do the policies and procedures require?”). Knowing the norm, that is, the “real life” practices, helps determine the reliability of processes and how often staff cut corners to get the work done. ISMP has also observed over-reliance on policies and procedures by some regulatory and licensing agencies that investigate events. When the agencies issue no citations because the policies and procedures look great on paper, the organization or RCA team may feel compelled, or find it easier, to stand behind the regulatory or licensing agency’s findings of “no system issues.”

Failure to conduct at-risk behavior investigation

RCAs often fail to closely examine the behavioral components of an error, an important omission. When an event involves staff who cut corners, breached a policy, or did not follow a procedure, the conditions that led to these at-risk behaviors are rarely investigated to uncover incentives that encourage the behavior, unintended consequences that discourage safe behavior, or why the risks associated with the task have faded to allow the behaviors to occur. Instead, the investigation stops with the identification of the cut corner or breached policy, which often results in punitive action for the involved individuals. Every at-risk behavior should be investigated further to determine its causes, which most often reside in the organization’s culture or design of systems.

Failure to identify deep-seated latent failures

Many RCAs do not dig deep enough to uncover the underlying system-based causes of events, or latent failures. To learn about latent failures, probing questions must be systematically asked about how the organization was managing information, the environment, human resources, equipment/technology, and associated human factors at the time of the event. See Table 2 for examples of probing questions for medication events. Repeatedly asking “why” each time a system or human factor has been identified as contributory leads to uncovering more deep-seated latent failures in the system.

Failure to conduct human error/human factors investigation

The investigation of an event sometimes ends when “human error” has been identified as the cause. However, a human-error investigation should always occur to uncover any preexisting performance shaping factors (e.g., task complexity, workflow, time availability/urgency, process design, experience, training, fatigue, stress) or other environmental conditions, system weaknesses, or equipment design flaws that allowed the error to happen and reach the patient. The investigation is incomplete if it ends with human error as the root cause because it fails to uncover how human errors get through the system and reach patients—information that is critical when planning the redesign of systems.

Failure to seek outside knowledge

RCA teams may get so involved in analysis of the specific event that they fail to recognize the value of looking outward for similar occurrences or related literature to see what could be learned. Internal error databases might uncover related events that have not led to harm, which can help identify and clarify risks. Also, professional literature, including research and anecdotal case reports, often helps in the analysis of the event and the selection of high-leverage, evidence-based, risk-reduction strategies. Applicable regulations, standards, professional guidelines, and consultation with clinical and safety experts can greatly enhance the RCA process and lead to greater success with interventions. We have also encountered RCA teams that are so entrenched in discussions that they fail to move out of the meeting room to visit the clinical areas involved in the event to observe the environment and processes firsthand or conduct a safe simulation, when possible.

Not linking the causation to the actions

The RCA action plan sometimes fails to clearly show a link between the proposed actions and the causative factors. To achieve buy-in for the action plan, it is important for administration and staff to be able to follow the logic of the RCA team. Each intervention should be clearly linked to one or more causative factors. Another barrier to buy-in is the veil of secrecy under which RCAs are performed. Although confidentiality is important during a RCA, enough information needs to be shared with staff who will be required to implement changes so they understand the purpose and importance of those changes.
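
For teams that keep the action plan in a spreadsheet or simple database, the link between interventions and causative factors can be checked mechanically. The following is only a minimal sketch under stated assumptions: the `Action` record, the `check_linkage` helper, the cause IDs, and the sample entries are all hypothetical and invented for illustration; they are not an ISMP tool or a prescribed format.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Action:
    """One proposed intervention in a hypothetical RCA action plan."""
    description: str
    # IDs of the causative factors this intervention is meant to address.
    cause_ids: list[str] = field(default_factory=list)


def check_linkage(causes: dict[str, str], actions: list[Action]):
    """Return (unlinked_actions, unaddressed_causes, unknown_ids).

    unlinked_actions:   actions that cite no causative factor at all
    unaddressed_causes: causative factors that no action targets
    unknown_ids:        cause IDs cited by an action but missing from `causes`
    """
    unlinked = [a.description for a in actions if not a.cause_ids]
    referenced = {cid for a in actions for cid in a.cause_ids}
    unaddressed = sorted(set(causes) - referenced)
    unknown = sorted(referenced - set(causes))
    return unlinked, unaddressed, unknown


# Illustrative data only; not taken from any real event.
causes = {
    "C1": "Look-alike products stored side by side",
    "C2": "Workload pressure encouraged skipping a check",
}
actions = [
    Action("Separate look-alike products in storage", ["C1"]),
    Action("Re-educate staff on the policy"),  # cites no causative factor
]

unlinked, unaddressed, unknown = check_linkage(causes, actions)
print("Actions with no linked cause:", unlinked)
print("Causes with no action:", unaddressed)
print("Unknown cause IDs:", unknown)
```

In this made-up example, the re-education item would be flagged because it cites no causative factor, and the workload factor would be flagged because no intervention addresses it. A check like this only confirms that a link has been recorded; whether the link is logical still requires the judgment of the RCA team.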

Selecting weak risk-reduction strategies

The most effective risk-reduction strategies involve redesigning systems to make them more resistant to human error, and enabling staff to make safe behavioral choices by removing the system- and cultural-based incentives for cutting corners. Yet developing new rules and educating staff, both relatively weak interventions, are among the most common risk-reduction strategies found in RCAs. Next in line is often a manual downstream double-check that does little to prevent the errors upstream. Strategies that rely heavily on human memory and vigilance are much weaker than strategies that prevent staff from carrying out tasks the wrong way, “force” them to carry out tasks the correct way, or involve automation to provide just-in-time decision support, verify accuracy, and halt progress when errors are made. Layering action plans with multiple strategies also helps ensure success.

Failure to carry out the action plan and measure success

A RCA is only useful if it results in positive change. Yet, we sometimes encounter RCA action plans with critical interventions that have not been implemented or without realistic plans for future implementation. In these cases, progress toward goals has not been monitored, and no structured format exists to support implementation of the action plan and monitor accountability. Some changes that have been implemented are later abandoned because they were designed without consideration of the workflow, barriers were encountered and not addressed, the reason for change was not clearly communicated to staff, or measures were not in place to quantify and monitor the scope of change and its effect on patient safety. Staff require motivation to make the initial change, and data that link the change to positive patient outcomes to sustain it. Interventions need to be tested on a small scale and revised as necessary, and then spread throughout the organization in all applicable areas. Even the best-laid plans don’t always work out; if that happens, the RCA team needs to develop new ways to deal with the risks.

Focus too narrow or too broad

Sometimes RCA teams don’t look broadly enough at the risks they uncover to determine if the same risks are present in other parts of the organization, or among other processes of care. For example, a deadly mix-up between look-alike products in one area of the hospital could happen in another area of the hospital. Yet, we often see interventions targeting just a single unit, service, or department. Or the RCA team may not address other products that look similar to the ones that were confused. Once risks are identified, the focus that was appropriately narrow during initial analysis of the event needs to widen to analyze the same or similar risks throughout the organization and among other care processes. Likewise, interventions addressing these risks should not be narrowly defined for implementation only in the immediate area involved in the event. On the other hand, we have occasionally encountered organizations that attempted to learn too much about distant system issues from a single event. This most often occurs when assumptions are made about risk and how interventions can be implemented without investigation or input from clinical areas that are dissimilar (e.g., inpatient and outpatient services).

Unjust punitive action

Some RCAs have been weakened by unjust punitive action taken against involved practitioners shortly after the event, largely due to hindsight bias and a prevailing but unfair outcome-based justice system in healthcare in which the patient’s outcome dictates the degree of punishment. We have also observed organizations holding involved clinicians accountable for duties that did not exist before the event or were not applicable given the situation, such as performing a double-check that might have averted the bad outcome but was not a required procedure, or calling a physician when the individual was unaware conditions warranted such an action. In either case, the RCA team is more inclined to focus on shortcomings of the individuals (as determined by organizational leadership often before the RCA begins) and less inclined to uncover underlying system causes of these actions. Further, due to punitive action, individuals involved in the event may not be available to provide important details during the analysis, often leading to inaccurate assumptions.