A piece of equipment in your plant just failed. You didn't see it coming. You aren't prepared for this. Management is demanding that production start again. Sound familiar?
STOP! STOP! STOP!
You may be caught in an endless cycle of failure after failure. You continually look for "who to blame" instead of figuring out what actually happened.
Failure, Rinse, Repeat! Isn't it time to do something different? Let's look at the entire picture to see this... Ever heard of the Uptime® Elements?
There are many ways you could have been prepared for this situation and could've been the hero instead of the person with all the pressure on you! Let's explore how you could've been prepared or, even better, prevented the failure. Several things could have stopped it or at least reduced its consequences.
I'm going to reference the Uptime Elements as one of the best ways to explain what could have been done here. This framework was created by reliabilityweb.com and is one of the best ways to show these types of situations and everything that could have affected the machine.
REM - Reliability Engineering for Maintenance
Is the design of this piece of equipment correct? Was it designed to be maintainable? Was it used improperly? A well-designed (Rcd - Reliability Centered Design) and well-built machine should have far fewer failures. These items should be discussed and detailed during the capital planning process (Cp - Capital Project Management). In addition, you should have known beforehand how critical (Ca - Criticality Analysis) this piece of equipment was and what the effects of its failure would be.
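To make the criticality idea concrete, here is a minimal sketch of one common way to rank equipment: multiply a consequence score by a likelihood score and sort. The 1 to 5 scales, the asset names, and the `Asset` and `rank_by_criticality` names are all hypothetical and chosen only for illustration. This is not the Uptime Elements method or any standard's scoring scheme.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    consequence: int  # hypothetical scale: 1 (minor) to 5 (plant-wide outage)
    likelihood: int   # hypothetical scale: 1 (rare) to 5 (frequent)

    @property
    def criticality(self) -> int:
        # Simple risk-style score: consequence times likelihood.
        return self.consequence * self.likelihood


def rank_by_criticality(assets):
    """Return assets ordered from most to least critical."""
    return sorted(assets, key=lambda a: a.criticality, reverse=True)


# Made-up example equipment and scores, for illustration only.
assets = [
    Asset("Cooling tower fan", consequence=3, likelihood=2),
    Asset("Main line drive motor", consequence=5, likelihood=3),
    Asset("Office HVAC unit", consequence=1, likelihood=4),
]

ranked = rank_by_criticality(assets)
for asset in ranked:
    print(f"{asset.name}: {asset.criticality}")
```

Even a rough ranking like this tells you where spares, condition monitoring, and PM attention should go first.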
ACM - Asset Condition Management
Depending on what the equipment is, you may have been able to avoid the catastrophic failure by utilizing condition monitoring. Different technologies and practices could have been used to see this failure coming and prevent it from causing downtime. Some of the technologies that can be utilized: vibration analysis (Vib), fluid analysis (Fa), motor testing (Mt), ultrasound testing (Ut), and infrared thermal imaging (Ir). In addition, was it installed for long-term reliability, for example with laser alignment (Ab - Alignment & Balancing), and was it lubricated properly (Lu - Machinery Lubrication)?
WEM - Work Execution Management
If this machine could cause this type of issue at your facility, was any thought ever given to having a spare (Mro - MRO Spares Management)? Having a spare is not always the answer, but it could've given you the time to evaluate the failure properly without the rush to get back up and running again. Let's be honest: the downtime still would've occurred, but it would've been substantially reduced.
In addition to having a spare, was the needed preventive maintenance (Pm - Preventive Maintenance) performed? Maybe you are too busy chasing failure after failure and cannot plan properly to perform the scheduled PMs (Ps - Planning and Scheduling). Did the operator see this machine having issues and not say anything (Odr - Operator Driven Reliability)?
As the Uptime Elements lay out so well, there are many possible reasons for this failure (many more than are shown here) and many actions that can be taken to prevent it. None of this matters if you do not have good leadership (LER - Leadership for Reliability) that is supportive and involved in your reliability journey. In addition, you need the correct business processes (AM - Asset Management) in place to organize all of this, track it, and make it actually happen.
I suggest you reach out to me if the beginning of this article sounds like you and the cycle your plant is running on. There is a way out, but it's not easy. It takes a long time (Rj - Reliability Journey) to create a culture based on reliability. This article barely scratches the surface.
Here is the Uptime Elements chart for you to reference. All credit for this chart and use of the Uptime Elements goes to reliabilityweb.com and The Association of Asset Management Professionals. HECO and I make no claim to the ownership of these - we are just thankful to reliabilityweb.com for providing this wonderful framework.
About the Author:
Justin T. Hatfield, CRL, CMRP
HECO - All Systems Go
Justin T. Hatfield, CRL, CMRP, is the President of HECO - All Systems Go. He is responsible for Electric Motor & Drive Sales, Electric Motor & Generator Repairs, Spare Solutions, and Predictive Services. Justin was instrumental in developing HECO MAPPS (Motor and Powertrain Performance Systems), which focuses on “why” you have a motor problem instead of simply “what” product or service should be recommended. Justin is a Certified Reliability Leader by The Association of Asset Management Professionals (AMP) and a Certified Maintenance and Reliability Professional (CMRP) by the Society of Maintenance and Reliability Professionals (SMRP). HECO is an EASA Accredited Service Center for Electric Motors as well as a provider of predictive maintenance & reliability services and products throughout the United States.