Ask A Quality Professional

1.  How to tell when root cause analysis reaches the root cause

Posted 6 days ago
My organization uses root cause analysis to achieve the objective of ISO 9001:2015 section 10.2.1 b 2) - determining the causes of the nonconformity. Our external auditor has a passion for root cause analysis and has repeatedly warned us that we do not go deep enough to find the root cause of our defects, culminating in a minor non-conformance in the last audit and a promise of a major non-conformance if the issue is not resolved before the next audit. We have identified several changes we will make to our process, tools, and training; however, we need to do more to manage the risk of a major non-conformance.

I would appreciate your help in defining the bright line of "deep enough". From my point of view, the turtles go all the way down.

We have rejected "actionable" as it addresses the immediate cause only. We use the 5-why method to move from immediate cause through causes to root cause. We have identified 3 stopping criteria we believe are defensible - further analysis and corrective actions are outside the company's control (control extends to suppliers willing to cooperate; there are a few who are willing to lose our business rather than assist), the information cannot be obtained, and the right person(s) with the needed information took a thought-through risk. The final criterion is the one that concerns me - the value of risk mitigation effort does not justify the cost of going deeper. This is a subjective decision and certain to be challenged.

Thank you.

------------------------------
Alan Berow
Senior Quality Assurance Engineer
West Chicago IL
------------------------------


2.  RE: How to tell when root cause analysis reaches the root cause

Posted 6 days ago
Hi Alan

This is a common issue. It's very easy to stop before finding the true root cause, but this is never a good idea long term.  I would scrap your three criteria for stopping analysis before finding the root cause; in my opinion there is no valid reason for stopping an analysis too soon.  Either a problem is important enough that it is worth finding and correcting the root cause, or it's not important enough to do any investigation.  In my opinion you either do it properly or you don't bother starting. Doing a partial investigation is a waste of the time and effort of everyone involved, and it can lead to frustration and to people wondering why they do root cause problem solving at all when they don't find the root cause and the problem(s) recur.

In my experience a large part of the problem with root cause analysis is that most of the tools used are a form of guessing.

Brainstorming = lots of people making guesses
Fishbone/Ishikawa Diagram = arranging guesses by category
5 Whys = normally a series of guesses etc.

A true root cause investigation needs to be based on data and on going and observing the location where the problem is occurring rather than making guesses in a meeting room.

I've done plenty of guessing in meeting rooms in the past (and this can be quite successful with simple issues where there are only a few possible root causes) but earlier this year I read 'Stop Guessing: The Nine Behaviors of Great Problem Solvers' by Nat Greene and it's changed how I do root cause analysis forever.  (Note: I'm not affiliated with Nat Greene or this book and I don't get anything for promoting it.)

When your auditor says that you're not going deep enough, they're saying that you aren't finding the root cause, that you're fixing something above the root cause.

Here's a simple example I've come across in training a few times:-

You walk into your garage and find you have a flat tyre. You could just change the tyre, but you know that this might lead to another flat, so you decide to do a root cause analysis.  So you ask:
  1. Why do I have a flat tyre? You find that there are nails on the floor of your garage and that one (or more) of these caused your flat.  Now you could just clean up the nails and change the tyre, but are the nails the root cause? You decide to ask why again to see if there is a deeper cause.
  2. Why were the nails on the floor of the garage? You discover that the shelf the nails were on broke, spilling the nails onto the floor.  Now you could fix the shelf, clean up the nails and change the tyre, but you decide to ask why again.
  3. Why did the shelf break? Your investigation discovers that the shelf got wet, which weakened it and resulted in it breaking, and this leads you to ask why again.
  4. Why did the shelf get wet? Investigating this leads you to discover that the ceiling is leaking over the shelf, so again you ask why.
  5. Why is the ceiling leaking?  This leads you to find out you have cracked tiles on your roof that leak when it rains.  Could you ask why again at this point? Maybe, if you ask the correct why. Asking why it rains would not be useful, since this is not going to lead to something controllable, but asking why the tiles are cracked could lead you to something controllable.

So to fix the flat tyre and resolve the root cause you would need to get the tiles fixed (and probably have the roof checked for more cracked tiles), repair the damage caused by the leak to the ceiling, the shelf and anything else that was damaged, clean up the nails and change the tyre. Cleaning up the nails and changing the tyre would be an adequate short-term fix, but this wouldn't address the root cause.  Most root cause investigations stop at this or possibly include fixing the shelf, but this does not address the root cause and if you stop at this point it's only a matter of time before the problem re-occurs.
If you were brainstorming in a meeting room about the root cause of the flat tyre how likely would it be that you'd come up with a root cause of cracked roof tiles? I'd suggest it's unlikely and yet this is a fairly simple chain of cause and effect, or effect traced back to cause.
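
For anyone who prefers to see the chain written down, here is a minimal Python sketch of the flat-tyre example as an ordered list of causes. The `Cause` class, its field names and the `unaddressed` helper are illustrative assumptions of mine, not part of any RCA standard or tool. The only point it makes is that every level has its own fix, and whatever sits deeper than the last level you fixed is still there to cause a recurrence.

```python
from dataclasses import dataclass


@dataclass
class Cause:
    why: str      # the question asked at this level
    answer: str   # what the investigation found
    action: str   # the fix that addresses this level only


# The flat-tyre chain, from the symptom down to the deepest cause found.
CHAIN = [
    Cause("Why do I have a flat tyre?", "nails on the garage floor", "clean up nails, change tyre"),
    Cause("Why were the nails on the floor?", "the shelf broke", "fix the shelf"),
    Cause("Why did the shelf break?", "the shelf got wet", "repair the water damage"),
    Cause("Why did the shelf get wet?", "the ceiling leaks", "repair the ceiling"),
    Cause("Why is the ceiling leaking?", "cracked roof tiles", "replace tiles, check the roof"),
]


def unaddressed(chain, levels_fixed):
    """Return the findings that remain in place when only the first `levels_fixed` levels are fixed."""
    return [cause.answer for cause in chain[levels_fixed:]]


# Stopping after cleaning up the nails and fixing the shelf leaves three causes untouched.
print(unaddressed(CHAIN, 2))
```

Calling `unaddressed(CHAIN, len(CHAIN))` returns an empty list, which is the situation described above where the tiles, ceiling, shelf, nails and tyre are all dealt with.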

Here's why I don't think any of your three stopping criteria are defensible:
While I can understand that the investigation becomes much harder when it passes into the realm of a supplier, this doesn't have to (nor should it) end the investigation; you just need to get the supplier involved.  So this should not be a criterion for stopping before finding the root cause.
'The information cannot be obtained': I would question the validity of this. If someone has taken a 'thought-through risk' then they've based this thinking on something; there will be data or criteria behind this decision, and this can be reviewed.
'The value of risk mitigation effort does not justify the cost of going deeper', while I agree that the cost of fixing something can be more than the benefit of fixing it, you cannot know this for sure if you don't know what the root cause is.  You're effectively saying it costs too much to fix the root cause, we're sure of this even though we don't know what the root cause is and therefore cannot know what measures would be needed to fix it.  How you would ever justify this is a mystery to me.

Sorry this ended up being so long, hopefully something in there will be of use to you.

Best Wishes,

------------------------------
Claire Everett
Prosegur Australia
St Leonards
(61)294909926
------------------------------



3.  RE: How to tell when root cause analysis reaches the root cause

Posted 5 days ago
Hi Claire -

Thanks for getting back to me and for the guidance. I will look into Nat Greene's book.

I apologize that I failed to provide some information that may be critical to your response and that I need to be vague about my company's products and services. My company converts data to information and then sells that information for others to turn into actionable knowledge. Events that lead to defects are rare, but have large consequences. Because of this, information about the defect is often automatically logged and the people involved are usually on the RCA team, so we generally have the information we need to find the cause and begin the root cause analysis, even on a conference call. Root cause analysis often moves to business or cultural causes - "Why didn't the team ask for clarification of an unclear requirement?". Significant time may have passed between introducing the defect and its being found. If the people on the team left the company in that time, we may not be able to get the answer; thus that stopping point, and the focus of our solutions on finding out whether other teams take the same risk.

Using the 5-why method, you can always ask, "Why?" again. In your example, I would ask why the tiles were cracked. Let's say the answer is because they were damaged by hail. While I cannot control hail (my "outside of company control" stopping point), I could also ask why I did not inspect the tiles after a hail storm. At this point my "value of going deeper is less than cost" might be invoked and the root cause is failure to inspect for tile damage after a hail storm. Preventive actions would include looking for damage in other places after hail and other potentially damaging weather events and telling my neighbors. Unfortunately, our auditor has a way of abstracting the analysis and asking another "Why?" in a different direction. In your example, we might have stopped with "hail is outside our control" and he would ask why we did not have a risk management plan that included inspecting. Alternatively, we may have come up with not inspecting and he might ask why this was not in our household risk management plan. This is why we need bright line guidance on knowing we have reached the root cause.
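
To make the "different direction" problem concrete, here is a rough Python sketch that treats the analysis as a tree rather than a single chain and requires every branch to end with a stated stop reason. The `StopReason` values paraphrase the stopping criteria from my original post; the class names, field names and the `undocumented_leaves` helper are hypothetical and only meant to illustrate the idea.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class StopReason(Enum):
    OUTSIDE_CONTROL = "further analysis or action is outside the company's control"
    INFO_UNAVAILABLE = "the information cannot be obtained"
    CONSIDERED_RISK = "an informed person took a thought-through risk"
    COST_EXCEEDS_VALUE = "cost of going deeper exceeds the value of the risk mitigation"


@dataclass
class Why:
    answer: str
    children: List["Why"] = field(default_factory=list)  # deeper whys, possibly several directions
    stop: Optional[StopReason] = None                     # expected on every leaf


def undocumented_leaves(node):
    """Return the answers where the team stopped asking without recording a reason."""
    if not node.children:
        return [] if node.stop else [node.answer]
    found = []
    for child in node.children:
        found.extend(undocumented_leaves(child))
    return found


# The cracked-tile example, with the auditor's extra "Why?" as a second branch.
tiles = Why("roof tiles were cracked", [
    Why("hail damaged the tiles", stop=StopReason.OUTSIDE_CONTROL),
    Why("tiles were not inspected after the hail storm", [
        Why("inspection is not in the household risk management plan"),
    ]),
])

print(undocumented_leaves(tiles))
```

The hail branch is closed with a reason on record, while the risk management branch shows up as an open question. That does not settle where the bright line is, but it does show the auditor exactly where and why each branch ended.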

------------------------------
Alan Berow
Senior Quality Assurance Engineer
West Chicago IL
------------------------------



4.  RE: How to tell when root cause analysis reaches the root cause

Posted 5 days ago
Alan:

1) I'm certainly going to add "Stop Guessing" by Nat Greene to my reading list!  Thanks, Claire!

2) I like the challenges that Claire has presented to you wrt your three criteria. But let's reframe the argument against Stop Reason #3: it's not that the (unknown) cost of fixing is greater or less than the cost of poor quality.  It's that the cost of the INVESTIGATION is more than the cost of the poor quality.

3) So I would further ask if there aren't statistical / specification adjustments that ought to be made which accommodate criterion #3, stopping the investigation when it does not make economic sense, or whether a better business case calculation should be applied before invoking #3.  The "for instance" that I can easily envision is the cost of downtime for a machine or production line far exceeding the improved value you get from eliminating its error rate - both the opportunity costs from not producing during the maintenance (investigation) window, and the cost of maintenance itself.  If those two costs exceed the cost of quality that comes from the error rate of the machine, then I agree with the criterion to stop the investigation.

3a) But what are the specs that this machine is trying to live up to?  If, in fact, your process is producing "off-quality" product, out of spec through an outside-control-limits (non-random causes) process, then when you cannot economically fix it, shouldn't you adjust your specification limits?  The stats being what they are, with the new specs you'll still be statistically operating with non-random causes producing sometimes unsellable product (or downstream "unusable" material).  But according to the new specs, it will not be "defective" product, and you will not trigger the RCA on the upstream producer because you're still within its capability to meet specification.  The same would hold with external, upstream suppliers.

3b)  But then this is where the BCA comes in.  Your upstream supplier will sell you, or your upstream production unit will make, within-spec-and-within-its-control-limits-but-still-unusable product.  You have a cost of waste material.  Turning attention downstream, to consumer processes or to customers, you're going to produce a certain amount of unsellable waste material (items, products,...) that have their own opportunity costs.   You not only have wasted material, but you have wasted production time, time spent producing the unsellable stuff.  Plus the risk of unsatisfied customers and potential lost business there.  Are ALL THOSE costs being included in the BCA determining whether to stop the RCA?  If so, then I would argue you're ok.  But as in (3)(a) you have to change the specifications for what's "on-quality" for the upstream processes such that you can accept their waste production.  If not, then Claire's argument holds and it's not worth stopping the RCA - in fact, it's not only economically justified but economically mandated that you continue with it.  [More often than not, this is where zealots are found to be right - once ALL the costs are thrown into the business case, it makes sense to keep driving and driving and driving ...]

4) Finally, in terms of where you trigger RCAs, I think another implication of (3)(a) is that you have to define a "Search" process - not "Inspection!!" - between the unfixable upstream supplier and your consuming process, or between your unfixable production process and what you offer for sale to your customer.  You have X amount of incoming material, some of which is unusable, or X amount you've produced, some of which is unsellable.  It's all within your newly-adjusted receiving or production specs, but still not all of it is usable, or sellable.  So there's a Search step that has to be undertaken to sort the wheat (usable) from the chaff (unusable).  The incoming stuff, or your product, needs to be directed to downstream uses, or it needs to be re-directed to waste / recycle.

Is this Lean?  Definitely not.  Economically justified?  Maybe.  But it's part of what needs to be included with ALL relevant costs in the business case that stops or continues the RCA, and it should be a natural outcome of accepting new specs that receive or produce unusable or unsellable material.
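
As a rough illustration of the business case comparison in (3) and (3)(b), here is a small Python sketch. Every figure is made up, and the function name and cost categories are my own assumptions, not an established model. It only shows how the stop/continue answer can flip once all the downstream costs are counted.

```python
def continue_rca(investigation_cost, downtime_cost, poor_quality_costs):
    """Return True when living with the defect costs more than investigating it."""
    cost_to_investigate = investigation_cost + downtime_cost
    cost_to_accept = sum(poor_quality_costs.values())
    return cost_to_accept > cost_to_investigate


# Hypothetical figures, for illustration only.
narrow_view = {"scrap_material": 4_000}
full_view = {
    "scrap_material": 4_000,
    "wasted_production_time": 6_000,
    "search_and_sort_step": 3_000,
    "lost_business_risk": 10_000,
}

print(continue_rca(8_000, 7_000, narrow_view))  # False: stopping looks justified
print(continue_rca(8_000, 7_000, full_view))    # True: continuing is the cheaper option
```

With only scrap counted, the 15,000 cost of investigating outweighs the 4,000 cost of accepting the defect. With all four categories counted, accepting it costs 23,000 and the conclusion reverses.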
​​​
Hope this adds positively to the discussion.
--Andy
​​

------------------------------
Andy Higgins
AJHiggins LLC
Transformational Consulting
Centerville, OH
(757) 621-1688
drew.j.higgins@icloud.com
------------------------------



5.  RE: How to tell when root cause analysis reaches the root cause

Posted 5 days ago
Hi Andy -

Thanks. I agree with #1; available from Amazon at a reasonable cost.

Thank you for clarifying #2; it is cost of analysis exceeding the value of expanding the scope of preventive actions - our RCAs find corrective actions and eliminate the cause of the defect and some of the causes of the causes. The issue is how do we know that we have found a set of root causes our auditor will accept as root causes. This may impact #3.

For number #3, I will repeat my apology to Claire and the additional information I finally provided...I need to be vague about my company's products and services. My company converts data to information and then sells that information for others to turn into actionable knowledge. Events that lead to defects are rare, but have large consequences.

Our data come from reality, which is constantly changing. We periodically meet with our customers to discuss how fresh and accurate the data need to be. We collect the data where a defect could lead to a safety issue or product recall. Relaxing these specifications is not an option.

For data we purchase, the cost of scrapping a datum is small. We start our analysis by identifying whether we were the cause. If we were not, our root cause analysis has two branches. The first is asking our supplier for a root cause analysis, which we may or may not get. The second is analyzing our incoming data acceptance criteria/methods and outgoing information verification for causes and root causes.

We do not do a formal cost analysis when deciding whether to terminate an analysis. While we may be naive (a.k.a. wrong), we have not developed an estimation model. I will ask our team whether they see value in creating one.

------------------------------
Alan Berow
Senior Quality Assurance Engineer
West Chicago IL
------------------------------



6.  RE: How to tell when root cause analysis reaches the root cause

Posted 5 days ago

Sounds like your auditor is trying to insert him/herself into your company's business. Nowhere in the standard does it or will it ever say anything about "deep enough", which is clearly a subjective thing. He/She probably needs to be reminded that your company is the customer of his/her company, and by extension, he/she is a supplier to you and your company. This is obviously not so easily communicated, but it is still valid. Nothing says you cannot challenge the auditor. If he/she tries that major NC threat nonsense, he/she will need to enumerate exactly what aspect of the standard you are failing to meet.

 

The final statement in 10.2.1 is, "Corrective actions shall be appropriate to the effects of the nonconformities encountered."

 

As long as you can provide evidence that your company has met the requirements of 10.2.1 a-f, he will be hard pressed to issue a nonconformance.

 

Also notice that 10.2.1 e and f state that (paraphrased), (e) the organization will update risks and opportunities determined during planning (RCA) as necessary, and (f) make changes IF necessary – meaning that, if you have identified risks, you should communicate them (like the cost of action vs. the benefit and probability of compliance from vendors) and IF actions are necessary, based on all factors (including risk), take them.

 

Don't let him or her push you guys around.

 

 

David Frye

Director of Quality

Pierce Distribution Services

david.frye@pdsc.biz

815-963-2841 (EXT 2229) - Office

815-298-1053 - Cell

www.PierceDistribution.com

The information contained in this message may be privileged,
confidential, and protected from disclosure. If the reader of this
message is not the intended recipient, or any employee or agent
responsible for delivering this message to the intended recipient, you
are hereby notified that any dissemination, distribution, or copying of
this communication is strictly prohibited. If you have received this
communication in error, please notify us immediately by replying to the
message and deleting it from your computer.

 






7.  RE: How to tell when root cause analysis reaches the root cause

Posted 5 days ago
Hi David -

Thanks. Several people on my team, including my manager, agree with you and are considering appealing his finding. Our auditor has an insect up his ventral orifice about root cause analysis. The section he cited, 10.2.1 b 2), requires that the organisation determine the causes of nonconformances. The finding was "Causes identified do not always identify the true causes of Nonconformances." "True causes" is subjective, thus our need for bright line guidance on true cause / root cause.

On the other hand, when our auditor looks at the 5-why we used to find the root cause and asks, "Why?" again, I silently say, "Oh s**t", which tells me we are not as good as we think we are. We are updating our process to use the stop criteria I mentioned in the original post along with training, embedded facilitators, adding a RCA quality check to our internal audits, and looking at the cause of why we did not predict the defect (risk management) and why the defect escaped. For the worst defects, we would also look at their root causes.

We need to ensure this is not an issue in the next audit, thus the need to provide guidance to the RCA teams.

------------------------------
Alan Berow
Senior Quality Assurance Engineer
West Chicago IL
------------------------------



8.  RE: How to tell when root cause analysis reaches the root cause

Posted 5 days ago
Hi Everyone

Firstly, Alan, thanks for adding clarification around your specific circumstances and for your ideas on where the example 5 Why analysis I shared could continue.  This does indeed nicely demonstrate that people can have different ideas on what constitutes 'deep enough' which is your original problem with your auditor.

Andy - Thanks for your input on point 2. You make a good point regarding the cost of the investigation being more than the cost of the current defect. It does, however, lead me to ask how 'you' can be sure the investigation costs more than the current defect plus future defects from the same cause. This seems like it could be tough to justify, although it could still be correct, especially in Alan's situation where occurrences are infrequent.

David - Great point about challenging the auditor. If they're going to record a non-conformance they need to be able to give specific, justifiable reasons for how you are failing to meet the criteria.  In this case the criteria don't even specifically state you need to use root cause analysis, so I'm not sure how the auditor can justify this finding.  However, that doesn't mean that they don't have a point about the robustness of the root cause analysis, and there could still be a valuable opportunity for improvement here.

Putting aside the fact that this is an audit finding, let's look at it from another angle.  Finding the true root cause of problems has value to the business, ISO 9001 and auditors aside.  Joseph Juran's law of 10 is about the ripple effect of a small error and the cost of correcting it at different stages:
  • $1 at the drawing board
  • $10 in verification
  • $100 in manufacture
  • $1,000 in assembly
  • $10,000 in commissioning
  • $100,000 in field retrofits
  • $1,000,000 in litigation
I apply this to root cause investigation by thinking that if it costs $1 to address the true root cause, it costs $10 to fix the next level up, $100 the level above that, and so on.  This means that finding and resolving the true root cause is the best option from an ongoing cost perspective. Of course, you are looking at a situation where there are only a few occurrences, which potentially makes this less of a factor for you, but your multiplying factor could be far higher than 10, which could make it more valid.
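
Here is that rule of thumb as a few lines of Python, purely as a sketch. The function name and the default values are my assumptions; the factor of 10 comes from the list above, and you would substitute your own multiplier if you have one.

```python
def cost_to_fix(levels_above_root, base_cost=1.0, factor=10.0):
    """Estimated cost of fixing a problem this many levels above the root cause."""
    return base_cost * factor ** levels_above_root


# Level 0 is the root cause itself; each level up multiplies the cost by the factor.
for level in range(5):
    print(level, cost_to_fix(level))

# A hypothetical higher multiplier, for rare but high-consequence defects.
print(cost_to_fix(3, factor=50.0))
```

The growth is exponential in the number of levels, so even a modest change in the multiplier changes the picture quickly: three levels up costs 1,000 at a factor of 10 and 125,000 at a factor of 50.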

It's also important to consider that one root cause can result in more than one type of error, so finding one root cause can eliminate the possibility of multiple unwanted outcomes.  It's for these reasons that I think it's potentially valuable for your company to look at whether your root cause analysis is robust enough, even if you can argue that the auditor's finding is not valid/justifiable.

The problem is that while some situations make it obvious that you've found and addressed the root cause, in others this can be hard to identify/prove.  From what you've been able to say about your situation, I suspect it's going to be difficult for you to show that the root cause you've identified is the true root cause, except where you can show that your resulting solution is successful in preventing the issue reoccurring.

Even when you can show that a particular 5 Whys path is finished and there are no further questions to be asked after the final why, someone can still say "if you asked a different why question at point X would this lead you to a different outcome?" and this is possible with many other RCA tools as well.

Which brings me back to the fact that the auditor has said your RCAs don't go deep enough. This, as David said, is a subjective statement that doesn't seem to be supported by the ISO 9001 criteria.  My suggestion is that you argue the finding is not valid, but that you also look at whether you need to make changes to your RCA process to ensure you are getting the benefits of finding and correcting the true root cause.

Best wishes

------------------------------
Claire Everett
Prosegur Australia
St Leonards
(61)294909926
------------------------------



9.  RE: How to tell when root cause analysis reaches the root cause

Posted 4 days ago
Hi All - Thanks for the information and advice. From that, my plan is to ask a few teams whether they are open to using the new model - explicitly checking whether the team has the correct people before answering the "Why?", explicitly stating why the analysis ended, looking for branches into new areas to explore -  with me sitting in as an observer rather than a facilitator. I will focus on the depth of the root cause and how well the new model is working.

When I get sufficient insight, I will ask others to do the same, then schedule katas for all facilitators to share the knowledge. With that and the modified process rolled out, we may be in a position to push back if the auditor disagrees with why we stopped. As added insurance, we will take up our auditor on his offer to look at several analyses and provide feedback; however, he is busy and may not be able to find the time.

Hi Claire - I have started reading Nat Greene's book. I may not agree with everything I have read so far; however, he has given me some things to think about and I have a lot of the book to complete. I will keep an open mind.

------------------------------
Alan Berow
Senior Quality Assurance Engineer
West Chicago IL
------------------------------



10.  RE: How to tell when root cause analysis reaches the root cause

Posted 4 days ago
This has been an interesting thread.  Many people and organizations have struggled with how deep to go on a root cause analysis and there is no definitive answer.  I agree with many of the comments that you need to challenge the auditor and the nonconformance if they write one.  As long as you are complying with 10.2, then they cannot write a valid nonconformance for how deep the analysis goes.  The last line, "Corrective actions shall be appropriate to the effects of the nonconformities encountered." is there to help companies with the issue of the cost of fixing deeper root causes.

One way to push back would be to argue, with evidence, that you comply with 10.2.1 d, "review the effectiveness of any corrective action taken".  As long as your CA is effective, it shouldn't matter how deep you went.  Let me go back to the flat tire example above:

Will you spend a huge amount of money to install a hail resistant roof so it won't leak after the next hailstorm?  Probably not, although you could.  Would you spend $10 for a plastic container that won't open or break when it hits the floor?  Probably, or at least I would.  You could then show that fixing the cause you chose was effective.  When it rains, the shelf breaks, but the nails don't scatter on the floor.  The team should realize that there is an incentive to fix the leak (repeated cost of fixing the shelf) and risk in not fixing the leak (potential safety issues), but it's not necessary in order to stop the flat tires.
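
If it helps to picture the effectiveness argument, here is a small Python sketch that treats the flat tire story as a chain of cause-and-effect links and checks what can still happen once one link is blocked. The link list restates the example from this thread; the function and variable names are hypothetical, and this is an illustration rather than a recognized RCA method.

```python
# Each pair reads "cause -> effect".
LINKS = [
    ("hail", "cracked tiles"),
    ("cracked tiles", "leaking ceiling"),
    ("leaking ceiling", "wet shelf"),
    ("wet shelf", "broken shelf"),
    ("broken shelf", "nails on floor"),
    ("nails on floor", "flat tire"),
]


def still_occurs(effect, start, links, blocked):
    """True if `effect` can still follow from `start` once the `blocked` links are removed."""
    active = [link for link in links if link not in blocked]
    reached, frontier = {start}, [start]
    while frontier:
        node = frontier.pop()
        for cause, result in active:
            if cause == node and result not in reached:
                reached.add(result)
                frontier.append(result)
    return effect in reached


# The $10 container: the shelf can still break, but the nails no longer reach the floor.
container = {("broken shelf", "nails on floor")}

print(still_occurs("flat tire", "hail", LINKS, container))     # False
print(still_occurs("broken shelf", "hail", LINKS, container))  # True
```

Blocking a single link is enough to stop the flat tire, which is the effectiveness evidence for 10.2.1 d. The second result shows what is left over: the shelf still breaks, which is the incentive and the risk mentioned above.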

When your auditor writes the nonconformance, you can just keep asking him, "Why?" and keep telling him he hasn't justified the validity of his nonconformance because he hasn't answered all of your 'whys'.  Now that would be funny....

In all seriousness, if your auditor is citing 10.2.1 b 2), remind him that the "Organization" shall determine the causes of nonconformances, not the auditor.  The nails on the floor, the broken shelf, the leaking roof, and even the hail damaging the roof are all causes.  Notice that nowhere does the term "ROOT" appear when referring to causes in section 10.2.



------------------------------
Dave Carroll
Quality Assurance Engineer
------------------------------



11.  RE: How to tell when root cause analysis reaches the root cause

Posted 4 days ago

BINGO!!!

You hit it on the head, David.

The organization not the auditor, the organization ensuring the effectiveness of the actions taken, and the lack of "ROOT" all speak to the fact that the auditor is NOT in charge of ANYTHING.

 

Loved the idea of reversing the why-why analysis on him for the nonconformance.

 

 

David Frye

Director of Quality

Pierce Distribution Services

david.frye@pdsc.biz

815-963-2841 (EXT 2229) - Office

815-298-1053 - Cell
