Guide to Failure Mode and Effects Analysis (FMEA)

Updated: January 9, 2024


With over two decades in business – spanning strategy consulting, tech startups and executive leadership – I am committed to helping your organization thrive.

At Reliability, we’re on a mission to help enhance strategic decision-making and operational excellence through the power of Root Cause Analysis, and I hope this article will be helpful! 

Our goal is to help you better understand Failure Mode and Effects Analysis (FMEA) by offering insights and practical tips based on years of experience. Whether you’re new to doing root cause analyses or a seasoned pro, we trust this will be useful in your journey towards working hard and working smart.

———————

Dealing with potential failures and mitigating their impacts forms the core of reliability engineering. A systematic tool known as Failure Mode and Effects Analysis (FMEA) provides a path to improved reliability and enhanced performance.

What is Failure Mode and Effects Analysis (FMEA)?


Failure Mode and Effects Analysis, commonly abbreviated as FMEA, is a structured, step-by-step, proactive approach to identifying and analyzing all the conceivable failures in a product, process, design, or service. First formalized in U.S. military procedures in the late 1940s and adopted by the aerospace industry in the decades that followed, this technique has since become a valuable tool across multiple industries.

A ‘failure mode’ in FMEA refers to the way in which a component, subsystem, system, process, or design could fail to meet the expected performance standards. Each failure mode is then evaluated for its potential effects on the system or end-users, building a comprehensive understanding of the associated risks.

The ultimate objective of FMEA is to quantify and prioritize the risks associated with these identified failure modes to guide the improvement efforts. It does this by assessing three key metrics for each failure mode: severity (the potential impact of the failure), occurrence (the likelihood of the failure happening), and detection (the ability to discover the failure before it becomes an issue).

An integral part of FMEA is the calculation of the Risk Priority Number (RPN), which is a numerical value derived from the severity, occurrence, and detection ratings. This RPN allows for the ranking of failure modes in terms of their potential risk, assisting in the prioritization of improvement actions.

FMEA empowers organizations to anticipate potential problems and implement corrective measures before any failure occurs. This makes it a powerful technique for enhancing the reliability, safety, and quality of products or processes, ultimately leading to better customer satisfaction and cost savings.

An Example of Failure Mode and Effects Analysis

Imagine an automotive manufacturer planning to release a new car model. An FMEA could be performed on the braking system:

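| Failure Mode  | Effects                     | Severity | Occurrence | Detection | Risk Priority Number (RPN) |
|---------------|-----------------------------|----------|------------|-----------|----------------------------|
| Brake failure | Accident, harm to occupants | 9        | 3          | 4         | 108                        |

Table: FMEA for brake failure in automotive planning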

In this case, the Risk Priority Number (RPN) of 108 (9 × 3 × 4), driven largely by the severity rating of 9, marks brake failure as a high-priority issue that needs to be addressed immediately to reduce the potential risk.
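The arithmetic behind this example can be checked in a few lines of Python. This is only a minimal sketch: the function name `risk_priority_number` is an illustrative choice, and the ratings are the ones given for brake failure above.

```python
def risk_priority_number(severity: int, occurrence: int, detection: int) -> int:
    """Multiply the three FMEA ratings into a Risk Priority Number."""
    return severity * occurrence * detection


# Ratings for the brake failure example: severity 9, occurrence 3, detection 4
brake_rpn = risk_priority_number(severity=9, occurrence=3, detection=4)
print(brake_rpn)  # 108
```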

Why Perform FMEA?

FMEA isn’t just another item on a checklist; it’s a stepping stone towards excellence. Performing an FMEA brings reliability into focus, helping teams to understand potential failures and their impacts. The technique turns hindsight into foresight, allowing for proactive measures and solutions. It builds a culture of anticipation and prevention rather than reaction.

When to Perform FMEA

FMEA is useful in a variety of scenarios. Essentially, any situation that calls for a proactive, systematic approach to risk management and failure prevention is a suitable moment to perform an FMEA.

Typically, FMEA is conducted during the design phase of a product, system, or process to spot potential failure modes and address them proactively. For instance, in the product development lifecycle, it is often employed during the early design phases where changes can be made without significant cost or time penalties.

However, FMEA isn’t confined to the early stages. It can be used throughout the entire lifecycle of a product or process to identify failures and their causes, including during the manufacturing phase, upon receipt of customer feedback, or even when a product or process undergoes significant changes.

Moreover, FMEA is crucial in industries where the cost of failure is very high, such as aerospace, automotive, healthcare, and nuclear energy. It’s also employed when there’s an introduction of new technologies or when modifications are made to design codes, standards, or regulations.

How to Perform FMEA

Let’s walk through the FMEA process step by step. FMEA is a thorough process, designed to identify potential product or process failures and anticipate their implications.


It presents a structured way to identify what could go wrong (failure modes), understand the consequences (effects), and estimate the gravity (severity). Here’s a step-by-step guide to performing an FMEA:

  1. Select a Process for Analysis

Begin by choosing a process that has exhibited problems or is typically troublesome in your organization or industry. This selection lays the groundwork for the analysis to follow.

  2. Formulate the FMEA Team

Organizational leadership should offer a clear project charter, setting the stage for the analysis. The team, led by a facilitator appointed by the leadership, consists of individuals directly involved in the process under analysis. This team should include representatives from diverse areas like design, manufacturing, quality control, sales, customer service, and more.

  3. Outline the Process

Delineate the process steps in a way that ensures everyone on the team fully comprehends the process under scrutiny. This clarity sets the stage for a thorough analysis.

  4. Identify Potential Failures

With the process clearly defined, have the team brainstorm potential failure modes for each step. The people working closely with the process are often best placed to spot these possible issues.

  5. Prioritize the Problems

Not all problems are created equal. Some issues might occur frequently or have a more substantial impact on product quality or safety. These issues warrant priority in your improvement efforts.

  6. Plan and Implement Remedial Measures

The team then devises strategies to reduce the likelihood of failure modes or mitigate their effects. These changes could relate to design alterations, process adjustments, or enhancement of detection mechanisms.

  7. Evaluate the Effectiveness of Changes

Like any improvement initiative, it’s crucial to measure the success of the implemented actions. The team should assess whether the changes made have effectively reduced the risk of failure or its impact.
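One possible way to gauge effectiveness is to re-score the failure mode after the corrective action and compare Risk Priority Numbers. The sketch below uses the brake failure ratings from the earlier example as the starting point; the "after" ratings are hypothetical, invented purely for illustration, and `risk_reduction` is an assumed helper name rather than part of any standard.

```python
def rpn(severity: int, occurrence: int, detection: int) -> int:
    """Risk Priority Number: the product of the three ratings."""
    return severity * occurrence * detection


def risk_reduction(before: tuple, after: tuple) -> float:
    """Fractional drop in RPN between two (severity, occurrence, detection) tuples."""
    rpn_before = rpn(*before)
    rpn_after = rpn(*after)
    return (rpn_before - rpn_after) / rpn_before


before = (9, 3, 4)  # brake failure ratings from the example table
after = (9, 2, 2)   # hypothetical re-scoring after remedial measures

reduction = risk_reduction(before, after)
print(f"RPN {rpn(*before)} -> {rpn(*after)} ({reduction:.0%} lower)")
```

Severity is left unchanged here on the assumption that the remedial measures target how often the failure occurs and how well it is detected, not how bad its effect would be.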

Once the team completes the analysis, they create an FMEA table detailing potential failures, their effects, and associated severity, occurrence, and detection rankings. Each ranking ranges from 1 to 10, with 1 indicating the lowest risk and 10 the highest.

Multiplying these rankings (RPN = Severity × Occurrence × Detection) gives the Risk Priority Number, which ranges from 1 to 1,000 and guides the team in prioritizing work based on the severity, likelihood, and detectability of each failure. Any recommended actions should then address these prioritized risks.
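The ranking step can be sketched as a short script. Only the brake failure row comes from this article; the other two failure modes and their ratings are hypothetical, and the `FailureMode` class is an illustrative structure, not a standard FMEA format.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # 1 (negligible) to 10 (catastrophic)
    occurrence: int  # 1 (rare) to 10 (nearly certain)
    detection: int   # 1 (almost always detected) to 10 (almost never detected)

    @property
    def rpn(self) -> int:
        """Risk Priority Number: Severity x Occurrence x Detection."""
        return self.severity * self.occurrence * self.detection


modes = [
    FailureMode("Brake squeal", 3, 6, 2),         # hypothetical
    FailureMode("Brake failure", 9, 3, 4),        # from the example above
    FailureMode("Warning light fault", 5, 2, 7),  # hypothetical
]

# Highest RPN first, so the riskiest failure modes head the action list
ranked = sorted(modes, key=lambda m: m.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.name}: RPN {mode.rpn}")
```

Sorting by RPN alone is the simplest policy. A team may also choose to review any failure mode with a very high severity rating regardless of where its RPN falls in the list.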

The living FMEA document, continually updated, serves as an invaluable resource for the development of derivative products and a repository of organizational knowledge. Critical characteristics, regulatory compliance needs, process controls, and other relevant data are also recorded here, offering a comprehensive view of potential risks and their mitigation strategies.

Criteria for FMEA Analysis

In a comprehensive FMEA study, every potential failure is assessed and evaluated based on three fundamental criteria. These criteria allow the team to grasp the significance of each failure mode, how frequently it’s likely to surface, and how readily it could be detected before resulting in any adverse impact.

  • Severity: This measures the gravity of the potential impact on the customer or end-user if the failure were to occur. It could be a minor inconvenience, a significant disruption, or a serious safety issue. Each failure mode is scored on a scale from 1 to 10, with 1 indicating a negligible effect and 10 indicating a catastrophic or critical outcome.
  • Occurrence: This gauges the likelihood of the failure mode occurring. It takes into consideration the frequency at which the problem is likely to transpire during the life of the system, product, or process. The team assigns a score from 1 to 10, where 1 suggests a rare event and 10 signifies a nearly certain occurrence.
  • Detection: This reflects the ability to identify or predict the failure before it manifests itself in an operational scenario. The aim here is to evaluate how robust the current controls are in catching or preventing the failure. A score between 1 and 10 is assigned, with 1 denoting an extremely high likelihood of detection and 10 representing a situation where the failure is almost certain to go undetected.
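When ratings are recorded in a spreadsheet or script, the three 1-to-10 scales above can be enforced before any score is calculated. The following is a rough sketch under that assumption; `validate_rating` and `score_failure_mode` are hypothetical helper names.

```python
def validate_rating(name: str, value: int) -> int:
    """Reject any rating outside the 1-10 integer scale used for all three criteria."""
    if not isinstance(value, int) or not 1 <= value <= 10:
        raise ValueError(f"{name} must be an integer from 1 to 10, got {value!r}")
    return value


def score_failure_mode(severity: int, occurrence: int, detection: int) -> dict:
    """Validate the three ratings and return them together with their RPN."""
    ratings = {
        "severity": severity,      # 1 = negligible effect, 10 = catastrophic
        "occurrence": occurrence,  # 1 = rare, 10 = nearly certain
        "detection": detection,    # 1 = very likely detected, 10 = almost certainly missed
    }
    for name, value in ratings.items():
        validate_rating(name, value)
    return {**ratings, "rpn": severity * occurrence * detection}


print(score_failure_mode(9, 3, 4)["rpn"])  # 108
```

Note that the detection scale runs in the opposite direction from what the word suggests: a higher number means the failure is harder to catch, which is why it raises the RPN.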

The team participating in the FMEA study must reach a consensus on the ranking for severity, occurrence, and detection for every failure mode. While the FMEA process is primarily qualitative in nature, it is beneficial to utilize any available data to substantiate the decisions regarding these rankings. This makes the analysis more precise and helps to mitigate any potential biases or assumptions. Remember, the ultimate goal of FMEA is to prioritize risks and facilitate the development of effective strategies to manage them.

Final Thoughts

Failure Mode and Effects Analysis (FMEA) is a robust and systematic approach to anticipating, prioritizing, and mitigating risks. It helps organizations preemptively address possible problems in their processes or products, paving the way for higher safety, improved quality, and increased customer satisfaction. Despite its complexity, the returns it offers in risk mitigation are substantial, making FMEA a tool of choice for organizations keen on proactive problem-solving and continuous improvement. It’s not just about solving problems; it’s about predicting and preventing them. That is the power of FMEA.

At Reliability, we’re here to help guide your journey through such complex yet rewarding methodologies. Empower your organization by learning more about Root Cause Analysis, as well as other valuable tools and strategies for achieving and maintaining quality and reliability. Get in touch with us today to see how we can facilitate your journey to improved reliability and heightened business performance.

———————

I hope you found this guide to Failure Mode and Effects Analysis insightful and actionable! Stay tuned for more thought-provoking articles as we continue to share our knowledge. Success is rooted in a thorough understanding and consistent application, and we hope this article was a step in unlocking the full potential of Root Cause Analysis for your organization.

Reliability runs initiatives such as an online learning center focused on the proprietary PROACT® RCA methodology and EasyRCA.com software. For additional resources, visit Reliability Resources.
