
RCM2 Reference Guide

Reliability-Centered Maintenance - The Seven Questions Methodology

What is RCM2?

Reliability-Centered Maintenance (RCM) is a structured decision-making process used to determine what must be done to ensure that physical assets continue to do what their users require in their present operating context.

RCM2 was developed by John Moubray in the late 1980s and early 1990s, adapting the original aviation RCM methodology (Nowlan & Heap, 1978) for industrial applications. It is one of the most widely used forms of RCM and satisfies the criteria of SAE JA1011.

RCM2 Definition (John Moubray)

"A process used to determine what must be done to ensure that any physical asset continues to do whatever its users want it to do in its present operating context."

Key Principle

RCM focuses on preserving system FUNCTION, not just preventing equipment failure. The consequences of failure drive maintenance decisions, not the failure itself.

RCM2 Objectives

1. Preserve Functions

Identify and protect the functions that matter most to operations, safety, and environmental compliance.

2. Identify Failure Modes

Determine all the ways each function can fail and the causes (failure modes) that are reasonably likely to occur.

3. Prioritize by Consequence

Classify failures by their consequences: hidden, safety/environmental, operational, or non-operational.

4. Select Effective Tasks

Choose maintenance tasks that are technically feasible AND worth doing based on consequences.

5. Optimize Maintenance

Achieve the right balance of proactive and reactive maintenance at minimum total cost.

The RCM2 Process Flow

FMEA (Information Worksheet):
  • Q1 FUNCTIONS: What must it do?
  • Q2 FUNCTIONAL FAILURES: How can it fail?
  • Q3 FAILURE MODES: What causes it?
  • Q4 FAILURE EFFECTS: What happens?

Decision Diagram:
  • Q5 CONSEQUENCES: Does it matter?
  • Q6 PROACTIVE TASKS: Prevent it?
  • Q7 DEFAULT ACTIONS: If not?
Historical Background
1960s - Aviation Origins

The Maintenance Steering Group (MSG) develops a systematic maintenance program for the Boeing 747

1978 - Nowlan & Heap Report

Nowlan and Heap of United Airlines publish "Reliability-Centered Maintenance", a report prepared for the US Department of Defense

1990 - RCM2 Developed

John Moubray adapts RCM for industrial applications, creates RCM2 methodology

1999 - SAE JA1011

SAE publishes standard defining minimum criteria for RCM processes

Today - Industry Standard

RCM-based methods are used globally across aviation, military, power generation, manufacturing, and more

When to Apply RCM2

Ideal Applications

  • Critical assets with high failure consequences
  • Safety-critical systems and protective devices
  • Complex equipment with multiple failure modes
  • Assets with high maintenance costs
  • New equipment without historical data
  • Equipment with changing operating context

May Not Be Cost-Effective For

  • Simple, low-cost equipment
  • Items with obvious failure modes
  • Equipment easily replaced on failure
  • Assets with no safety/operational impact
โ“ The Seven Questions of RCM2

RCM2 answers these seven questions in order for each asset in its operating context. The first four questions form the FMEA (Information Worksheet), while questions 5-7 drive the Decision Diagram.

Q1
What are the FUNCTIONS and associated performance standards of the asset in its present operating context?
Functions describe what the asset must DO, not what it IS. Every function statement must include:
  • Verb: The action (e.g., pump, transfer, contain)
  • Object: What is acted upon (e.g., water, oil, pressure)
  • Performance Standard: Quantified where possible (e.g., "at least 500 GPM at 80 PSI")

Example: "To pump water from Tank A to Tank B at not less than 800 liters per minute"

Q2
In what ways can it FAIL to fulfill its functions (Functional Failures)?
Functional Failures are failed states - conditions where the asset cannot meet the performance standard defined in Q1. For each function, identify all ways it can fail:
  • Total loss: Complete inability to perform function
  • Partial failure: Performs below required standard
  • Over-performance: Exceeds acceptable limits

Example: "Unable to pump any water" or "Pumps less than 800 L/min"
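The three kinds of functional failure can be checked against the performance standard from Q1. The sketch below is only an illustration: it reuses the 800 L/min minimum from the example, while the function name, the state names, and the 1000 L/min upper limit are assumptions added here, not part of RCM2.

```python
from enum import Enum


class FunctionalState(Enum):
    OK = "meets the performance standard"
    TOTAL_LOSS = "total loss of function"
    PARTIAL = "performs below the required standard"
    OVER = "exceeds the acceptable upper limit"


def classify_flow(measured_lpm, min_lpm=800.0, max_lpm=None):
    """Compare a measured flow rate (L/min) with the Q1 performance standard."""
    if measured_lpm <= 0:
        return FunctionalState.TOTAL_LOSS
    if measured_lpm < min_lpm:
        return FunctionalState.PARTIAL
    # An upper limit only applies if the function statement defines one.
    if max_lpm is not None and measured_lpm > max_lpm:
        return FunctionalState.OVER
    return FunctionalState.OK


# Hypothetical readings, checked against a hypothetical 1000 L/min upper limit.
for flow in (0, 650, 820, 1200):
    print(flow, "L/min ->", classify_flow(flow, max_lpm=1000).name)
```

Each failed state returned here would be listed as a separate functional failure on the worksheet.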

Q3
What causes each functional failure (Failure Modes)?
Failure Modes are the specific events that cause functional failures. Include all failure modes that are reasonably likely to occur in the operating context:
  • Failures that have occurred before on this or similar equipment
  • Failures being prevented by existing maintenance
  • Failures that haven't happened but are realistic possibilities

Examples: Impeller worn, Seal failure, Bearing seizure, Motor winding failure, Coupling broken

Q4
What happens when each failure occurs (Failure Effects)?
Failure Effects describe what happens when each failure mode occurs - enough information to evaluate consequences. Include:
  • Evidence that failure has occurred (alarms, visible signs, etc.)
  • Safety or environmental hazards
  • Effect on production or operations
  • Physical damage caused
  • What must be done to repair the failure
Q5
In what way does each failure MATTER (Failure Consequences)?
Consequences determine the priority and type of maintenance task. RCM2 classifies failures into four categories:
  • Hidden: Failure not evident under normal conditions (protective devices)
  • Safety/Environmental: Could injure/kill someone or breach environmental standards
  • Operational: Affects production, quality, or customer service
  • Non-Operational: Only involves direct cost of repair
Q6
What should be done to PREDICT or PREVENT each failure (Proactive Tasks)?
Proactive Tasks are performed before failure to prevent or detect impending failure:
  • On-Condition Tasks: Detect potential failure (PdM, condition monitoring)
  • Scheduled Restoration: Restore to original capability at fixed intervals
  • Scheduled Discard: Replace at or before specified life limit

A task is only selected if it is technically feasible AND worth doing based on consequences.
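For failures with purely economic consequences, "worth doing" can be read as a cost comparison over a period of time. The sketch below is a simplified illustration under stated assumptions: costs are annualized, the function and parameter names are hypothetical, and every figure in the example is invented. For safety or environmental consequences the test is about risk, not cost, so this comparison does not apply.

```python
def worth_doing(task_cost, task_interval_years, failure_cost, mtbf_years,
                residual_failure_fraction=0.0):
    """Return True if doing the task costs less per year than not doing it.

    residual_failure_fraction is the assumed share of failures that still
    occur even though the task is being done (0.0 means none).
    """
    annual_task_cost = task_cost / task_interval_years
    annual_failure_cost = failure_cost / mtbf_years
    annual_cost_with_task = (annual_task_cost
                             + residual_failure_fraction * annual_failure_cost)
    return annual_cost_with_task < annual_failure_cost


# Hypothetical figures: a 200-unit inspection every quarter versus a
# 12,000-unit failure expected about once every 4 years.
print(worth_doing(task_cost=200, task_interval_years=0.25,
                  failure_cost=12_000, mtbf_years=4,
                  residual_failure_fraction=0.1))
```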

Q7
What should be done if a suitable proactive task cannot be found (Default Actions)?
Default Actions are used when no proactive task is appropriate:
  • Failure-Finding: Scheduled checks for hidden failures
  • Redesign: Modify equipment, procedures, or training
  • Run-to-Failure: Allow failure to occur (only for non-safety consequences)

Important: Run-to-failure is NEVER acceptable if failure has safety or environmental consequences!

The Six Failure Patterns

Research by Nowlan & Heap (1978) revealed that equipment doesn't always fail in predictable, age-related patterns. These six patterns show the conditional probability of failure over time.

Pattern A - Bathtub Curve (4%)

High infant mortality, low random, then wear-out zone

Pattern B - Wear-Out (2%)

Low random failure, then increasing probability at end of life

Pattern C - Fatigue (5%)

Gradually increasing probability of failure, no distinct wear-out

Pattern D - Initial Break-In (7%)

Low initially, rapid rise to constant random level

Pattern E - Random (14%)

Constant probability of failure at any age

Pattern F - Infant Mortality (68%)

High initial failure, then low random probability
Key Insights

Critical Finding

In the United Airlines study, only 11% of items showed age-related failure behavior (Patterns A, B, C). The remaining 89% showed random failure or infant mortality characteristics. Time-based maintenance alone cannot prevent most failures!
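The 11% and 89% figures are simply sums of the per-pattern percentages quoted for the United Airlines study. A minimal check, with variable names chosen only for this illustration:

```python
# Percentage for each failure pattern in the United Airlines study,
# as quoted in this guide.
ual_patterns = {"A": 4, "B": 2, "C": 5, "D": 7, "E": 14, "F": 68}

age_related = sum(ual_patterns[p] for p in "ABC")
not_age_related = sum(ual_patterns[p] for p in "DEF")

print(f"Age-related (A + B + C): {age_related}%")
print(f"Random / infant mortality (D + E + F): {not_age_related}%")
```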

Age-Related Failures (11%)

  • Patterns A, B, C show increased failure probability with age
  • Often involve contact with product: erosion, corrosion, abrasion
  • Can be addressed with scheduled restoration or discard
  • Time-based PM can be effective

Random/Infant Mortality Failures (89%)

  • Patterns D, E, F show no predictable wear-out point
  • Time-based PM provides little or no benefit
  • Often complex equipment with multiple failure modes
  • Condition monitoring (PdM) is the best approach
  • Focus on precision maintenance to prevent induced failures

Pattern F - Why 68%?

Many failures attributed to Pattern F are commonly traced to human error during installation, maintenance, or operation. Proper training, procedures, and precision practices can significantly reduce these failures.

Research Studies Comparison

| Pattern | Type | UAL Study (1978) | Broberg (1973) | SUBMEPP (1993) | Effective Maintenance |
|---|---|---|---|---|---|
| A - Bathtub | Age-Related | 4% | 3% | 3% | Scheduled discard/restoration after burn-in |
| B - Wear-Out | Age-Related | 2% | 1% | 17% | Scheduled discard before wear-out point |
| C - Fatigue | Age-Related | 5% | 4% | 3% | Condition monitoring for degradation |
| Total Age-Related | | 11% | 8% | 23% | |
| D - Break-In | Random | 7% | 11% | 6% | Condition monitoring |
| E - Random | Random | 14% | 15% | 42% | Condition monitoring, redundancy |
| F - Infant Mortality | Random | 68% | 66% | 29% | Precision maintenance, proper procedures |
| Total Random | | 89% | 92% | 77% | |

UAL = United Airlines (aviation), Broberg = Swedish study, SUBMEPP = US Navy Submarine Maintenance

Failure Consequence Categories

RCM2 first asks whether failure is evident to operators under normal circumstances. Then it classifies evident failures by their impact. This determines the type and urgency of maintenance required.

Safety/Environmental

Failure could injure or kill someone, or could breach environmental regulations or standards.

Priority: HIGHEST

Must eliminate or reduce risk to tolerable levels

Operational

Failure affects production output, product quality, customer service, or operating costs in addition to the direct cost of repair.

Priority: HIGH

Task must reduce total cost of failure

Non-Operational

Failure involves only the direct cost of repair - no safety, environmental, or production impact.

Priority: LOWEST

Task must cost less than repair cost

Hidden vs Evident Failures

Evident Failure

Someone will become aware of the failure under normal circumstances - through alarms, obvious malfunction, quality defects, etc.

Hidden Failure

The failure will not be noticed unless a specific check is made, OR until a demand is placed on the device (e.g., safety device needed during emergency).

The Danger of Hidden Failures

Hidden failures expose the organization to multiple failures - if a protective device has already failed silently, and the protected equipment then fails, the consequences can be catastrophic.

Key Principle: Hidden failures alone have no direct impact. Their consequences only become apparent when combined with another failure. This is why failure-finding tasks are essential for protective devices.

Consequence Decision Flow

1. Is the failure evident to operators during normal operation?
  • NO → HIDDEN
  • YES → EVIDENT

2. For EVIDENT failures: Does it have safety or environmental consequences?
  • YES → S/E
  • NO → Check operational

3. Does it affect operations (production, quality, cost)?
  • YES → OPERATIONAL
  • NO → NON-OPERATIONAL
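The decision flow above can be sketched as a small function. This is an illustrative sketch only: the function name, parameter names, and the two example failure modes are assumptions, and a real analysis answers each question through team discussion, not boolean flags.

```python
def classify_consequence(evident, safety_or_environmental, affects_operations):
    """Walk the consequence decision flow in the order the guide gives it."""
    if not evident:
        return "HIDDEN"
    if safety_or_environmental:
        return "SAFETY/ENVIRONMENTAL"
    if affects_operations:
        return "OPERATIONAL"
    return "NON-OPERATIONAL"


# Hypothetical failure modes, for illustration only.
print(classify_consequence(evident=True, safety_or_environmental=False,
                           affects_operations=True))    # e.g. duty pump bearing seizes
print(classify_consequence(evident=False, safety_or_environmental=False,
                           affects_operations=False))   # e.g. standby trip switch stuck
```

Evidence is tested first, so a failed protective device is classed as hidden whatever the other answers are.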

Proactive Tasks (Q6)

Tasks performed before failure to prevent or predict failure. Must be technically feasible AND worth doing.

On-Condition Tasks (Condition Monitoring)

Detect potential failure with enough warning to act. Requires detectable P-F interval. Examples: vibration analysis, thermography, oil analysis, ultrasound.

Scheduled Restoration Tasks

Restore to original capability at fixed intervals regardless of condition. Requires age-reliability relationship and identifiable "life" point.

Scheduled Discard Tasks

Replace at or before specified life limit regardless of condition. For items where restoration is impractical or where failure is catastrophic.

Task Selection Hierarchy

RCM2 prefers tasks in this order: On-Condition → Scheduled Restoration → Scheduled Discard. On-condition is preferred because it bases action on actual condition, not assumed age.

Default Actions (Q7)

Used when NO proactive task is technically feasible or worth doing.

Failure-Finding Tasks

Scheduled checks to discover hidden failures before a demand. Used for protective devices. Interval based on acceptable unavailability.
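One common way to base the interval on acceptable unavailability is the linear approximation FFI = 2 × unavailability × MTBF of the protective device. This formula is not stated in the guide above. It comes from general RCM literature and is usually quoted as valid only for a single device with random failures and low unavailability (roughly below 5%). The function name and example figures below are hypothetical.

```python
def failure_finding_interval(mtbf_years, acceptable_unavailability):
    """Approximate failure-finding interval: FFI = 2 x unavailability x MTBF.

    Assumes a single protective device with random failures; the linear
    approximation is only used here for unavailability up to 5%.
    """
    if mtbf_years <= 0:
        raise ValueError("MTBF must be positive")
    if not 0 < acceptable_unavailability <= 0.05:
        raise ValueError("approximation assumes unavailability in (0, 0.05]")
    return 2 * acceptable_unavailability * mtbf_years


# Hypothetical device: MTBF of 10 years, 1% unavailability tolerated.
ffi_years = failure_finding_interval(mtbf_years=10, acceptable_unavailability=0.01)
print(f"Check roughly every {ffi_years * 52:.0f} weeks")
```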

Redesign / One-Time Change

Modify hardware, operating procedures, or training. Required when consequences are intolerable and no suitable task exists.

Run-to-Failure (No Scheduled Maintenance)

Allow failure to occur, then repair. ONLY acceptable if failure has NO safety or environmental consequences.

Critical Rule

Run-to-failure is NEVER acceptable for failures with safety or environmental consequences! If no proactive task works, redesign is mandatory.

Task Selection by Consequence Type

| Consequence | Proactive Task Criteria | If No Proactive Task |
|---|---|---|
| Hidden | Task must reduce risk of multiple failure to tolerable level | Failure-finding task is mandatory. If not possible, redesign |
| Safety/Environmental | Task must reduce risk of failure to tolerable level (or eliminate) | Redesign is mandatory. Run-to-failure NOT acceptable |
| Operational | Total cost of task must be less than total cost of failure over time | Run-to-failure may be acceptable if economically justified |
| Non-Operational | Cost of task over time must be less than cost of repair | Run-to-failure is acceptable (direct repair cost only) |
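The table can be read as a lookup from consequence type to the fallback when no proactive task qualifies. The sketch below is a simplified illustration with hypothetical function and argument names. It assumes the feasibility and worth-doing checks have already been made elsewhere and are passed in as booleans.

```python
def select_policy(consequence, proactive_task_ok, failure_finding_feasible=False):
    """Return the maintenance policy for one failure mode.

    consequence: "hidden", "safety/environmental", "operational"
                 or "non-operational".
    proactive_task_ok: a proactive task is technically feasible and worth doing.
    """
    if proactive_task_ok:
        return "proactive task"
    if consequence == "hidden":
        return "failure-finding" if failure_finding_feasible else "redesign"
    if consequence == "safety/environmental":
        return "redesign"  # run-to-failure is never acceptable here
    if consequence == "operational":
        return "run-to-failure (if economically justified)"
    if consequence == "non-operational":
        return "run-to-failure"
    raise ValueError(f"unknown consequence type: {consequence!r}")


print(select_policy("hidden", proactive_task_ok=False, failure_finding_feasible=True))
print(select_policy("safety/environmental", proactive_task_ok=False))
```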
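The P-F Curve & Interval

[Figure: P-F curve. Condition is plotted against time, falling from New through Normal Operation to P (Potential Failure, detectable) and then to F (Functional Failure). The P-F Interval spans P to F; the Task Interval is shorter than the P-F Interval.]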

P-F Interval Rules

  • P (Potential Failure): Point where deterioration becomes detectable
  • F (Functional Failure): Point where asset can no longer perform required function
  • P-F Interval: Time between P and F - the warning period
  • Task Interval: Must be less than P-F interval to catch failure before it occurs
  • Rule of thumb: Task Interval = P-F Interval ÷ 2
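As a worked example of the rule of thumb, the sketch below halves an assumed P-F interval. The function name and the 8-week figure are hypothetical. The `fraction` parameter is an added convenience for cases where a team chooses something other than one half.

```python
def on_condition_interval(pf_interval, fraction=0.5):
    """Task interval as a fraction of the P-F interval (default: one half)."""
    if pf_interval <= 0:
        raise ValueError("P-F interval must be positive")
    if not 0 < fraction < 1:
        raise ValueError("task interval must be shorter than the P-F interval")
    return pf_interval * fraction


pf_weeks = 8  # hypothetical: the defect is detectable 8 weeks before functional failure
task_weeks = on_condition_interval(pf_weeks)
# Worst case: the defect becomes detectable just after an inspection,
# so the warning time left after it is found is P-F minus the task interval.
print(f"Inspect every {task_weeks:g} weeks; "
      f"at least {pf_weeks - task_weeks:g} weeks remain to act")
```

Halving the interval guarantees that at least half the P-F interval is left to plan and carry out the corrective work.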
RCM2 Team Composition

RCM2 analyses are conducted by cross-functional teams, not individuals. A typical team has 6-8 members:

RCM Facilitator

Guides the process, ensures rigor, trained in RCM methodology. Does NOT need to know the equipment.

Operations Representative

Knows what the equipment must do, operating context, production requirements.

Maintenance Technician(s)

Knows how equipment fails, repair history, maintenance practices.

Engineering

Technical expertise, design intent, modifications, OEM knowledge.

Safety/Environmental

Regulatory requirements, hazard identification, environmental compliance.

Key Success Factor

The team must include people who operate and maintain the equipment daily. They have knowledge no document can capture. RCM2 captures institutional knowledge before it's lost.

RCM2 Worksheets
Information Worksheet (FMEA)

Records answers to Questions 1-4:

  • System/Subsystem identification
  • Functions and performance standards
  • Functional failures
  • Failure modes
  • Failure effects
Decision Worksheet

Records answers to Questions 5-7:

  • Consequence classification (H/S/E/O/N)
  • Decision diagram responses
  • Selected maintenance tasks
  • Task intervals
  • Responsible person/craft
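The two worksheets map naturally onto nested records: a function has functional failures, each of which has failure modes that carry both their effects (Q4) and the decision fields (Q5-Q7). The layout below is one possible sketch, not a prescribed RCM2 schema. The class and field names are assumptions, and the effects, task, interval, and craft in the example are invented.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class FailureMode:
    description: str          # Q3 (Information Worksheet)
    effects: str              # Q4 (Information Worksheet)
    consequence: str = ""     # Q5: H, S, E, O or N (Decision Worksheet)
    task: str = ""            # Q6/Q7: selected task or default action
    interval: str = ""
    responsible: str = ""


@dataclass
class FunctionalFailure:
    description: str          # Q2
    failure_modes: List[FailureMode] = field(default_factory=list)


@dataclass
class Function:
    statement: str            # Q1, including the performance standard
    functional_failures: List[FunctionalFailure] = field(default_factory=list)


pump = Function(
    statement="To pump water from Tank A to Tank B at not less than 800 liters per minute",
    functional_failures=[
        FunctionalFailure(
            description="Pumps less than 800 L/min",
            failure_modes=[
                FailureMode(
                    description="Impeller worn",
                    effects="Flow drops gradually; Tank B level falls (hypothetical)",
                    consequence="O",
                    task="Monitor flow rate (hypothetical)",
                    interval="Monthly (hypothetical)",
                    responsible="Operator (hypothetical)",
                )
            ],
        )
    ],
)

for ff in pump.functional_failures:
    for fm in ff.failure_modes:
        print(f"{ff.description} <- {fm.description}: [{fm.consequence}] {fm.task}")
```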

Common Mistakes

  • Rushing through functions - poor functions = poor analysis
  • Missing failure modes (especially hidden failures)
  • Selecting tasks without checking feasibility
  • Not implementing the results in CMMS
  • Treating RCM as a one-time exercise (it's a living program!)
Implementation Steps

| Phase | Step | Description | Key Outputs |
|---|---|---|---|
| Preparation | 1 | Select system/equipment for analysis (criticality ranking) | Prioritized asset list |
| Preparation | 2 | Define system boundaries and operating context | Context statement, boundaries |
| Preparation | 3 | Assemble team, gather documentation | Team roster, P&IDs, manuals |
| Analysis | 4 | Identify functions and performance standards (Q1) | Function list |
| Analysis | 5 | Identify functional failures and failure modes (Q2-Q3) | FMEA worksheet |
| Analysis | 6 | Document failure effects (Q4) | Completed Information Worksheet |
| Analysis | 7 | Apply decision logic for each failure mode (Q5-Q7) | Decision Worksheet, task list |
| Implementation | 8 | Review and approve maintenance tasks | Approved task list |
| Implementation | 9 | Load tasks into CMMS, assign resources | PM work orders |
| Implementation | 10 | Train personnel, begin execution | Trained workforce |
| Living RCM | 11 | Monitor results, track failures | Performance metrics |
| Living RCM | 12 | Update analysis based on new data | Revised worksheets |
Standards & Resources

| Standard/Resource | Description |
|---|---|
| SAE JA1011 | Evaluation Criteria for RCM Processes |
| SAE JA1012 | Guide to the RCM Standard |
| IEC 60300-3-11 | RCM Application Guide |
| MIL-HDBK-2173 | US Military RCM Requirements |
| NAVAIR 00-25-403 | Naval Aviation RCM Guidelines |
| RCM2 Book | John Moubray - "Reliability-Centered Maintenance" (2nd Ed.) |
Benefits of RCM2

Proven Results

  • Reported reductions of 40-70% in routine maintenance workload where RCM2 replaces an existing maintenance program
  • Improved safety through systematic risk identification
  • Better uptime by focusing on functions, not just equipment
  • Lower costs by eliminating unnecessary maintenance
  • Knowledge capture from experienced personnel
  • Regulatory compliance with documented decision process
  • Living program that improves with operational data