Moving from savings to revenue

Can IIoT shift the focus towards uptime?

Many plants do not prioritise measuring the cost of downtime, which is seen as complex and is often underestimated. Yet investing the time and energy to understand the true cost of unscheduled downtime can be a catalyst for better practices.

Benjamin Franklin’s statement that “In this world, nothing can be said to be certain, except death and taxes” was likely made before the Industrial Revolution. Today, in any industrial plant there is a third area of certainty: unplanned machine downtime.

Downtime imposes financial and environmental losses on society. The cost to process industries alone is $20 billion a year. At a global level, the impact includes lost production, environmental damage, and unsafe work environments. Given the magnitude of the damage, one would expect reducing downtime to be a corporate priority. Yet in the industrial arena downtime is tolerated, and most industrial plants do not measure the dollar cost associated with it.

With the adoption of Industry 4.0, industrial plants are digitalising their physical assets. In our opinion, the rise of Industrial IoT or IIoT can significantly alter the reliability and maintenance discipline. In fact, as industrial plants apply Big Data and Machine Learning to their operating environments, reducing the considerable dollars wasted by machine downtime is finally achievable.

The state of machine downtime measurement today

Let’s start with some troubling statistics about machine downtime that should be keeping every plant manager up at night. According to the American Productivity and Quality Center (APQC), unplanned machine downtime accounts for 4% of scheduled run time for those industrial plants in the Median category. For bottom performers, unplanned downtime accounts for 6% of scheduled run time. If we translate this into revenue terms, out of every twenty dollars, between $0.80 and $1.20 is lost due to unplanned downtime.
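The translation from run-time percentages to dollars can be sketched in a few lines. This is a minimal illustration that assumes revenue scales linearly with scheduled run time; the function name `revenue_lost_per` is hypothetical and not part of the APQC data.

```python
def revenue_lost_per(amount: float, downtime_share: float) -> float:
    """Dollars lost out of `amount` when `downtime_share` of scheduled run time is lost.

    Assumption: revenue is proportional to scheduled run time.
    """
    if not 0.0 <= downtime_share <= 1.0:
        raise ValueError("downtime_share must be between 0 and 1")
    return amount * downtime_share


# APQC figures quoted in the text: 4% (median) and 6% (bottom performers).
median_loss = revenue_lost_per(20.0, 0.04)
bottom_loss = revenue_lost_per(20.0, 0.06)

print(f"Median plants:     ${median_loss:.2f} of every $20")
print(f"Bottom performers: ${bottom_loss:.2f} of every $20")
```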

As troubling as the numbers above are, experts Dave Crumrine and Doug Post claim that 80% of industrial plants cannot estimate downtime with accuracy. Furthermore, in many facilities downtime is underestimated by 200-300%. If these numbers are accurate, then it is quite possible that in some industrial plants four out of twenty dollars is lost due to downtime.

Downtime cost 101: how to quantify unscheduled downtime

The formula to calculate the dollar cost and production lost by downtime is simple. Here is a basic formula used for True Downtime Cost:

Downtime Dollars = Tangible Costs (Lost Revenue + Labor Costs + Cost of Excess Capacity) + Intangible Costs

Tangible costs are broken down into their individual components:

Lost revenue. The opportunity cost of lost production. If 10 units are produced in an hour, and the revenue for each unit is 10 dollars, then each hour of downtime costs 100 dollars in lost revenue.

Labor costs. When machinery is inoperable, plant workers are unproductive and cannot be reassigned during waiting periods. The labor cost per employee is included in true downtime cost.

Another item in labor cost is the repair cost premium. When downtime is unscheduled, repair crews may need to leave other job sites and reprioritise their tasks. Spare parts and other equipment may also need to be ordered. The time wasted on waiting for repair crews and parts to arrive adds a further cost element to the true downtime cost.

Cost of excess capacity. Industrial plants need to carry excess production capacity in case of spikes in demand. Unscheduled downtime reduces capacity, so plants must carry additional capacity to compensate. The dollar cost of this additional capacity is a line item included in true downtime cost.

Intangible costs are broken down as follows:

  • Stress. Although a dollar figure cannot be assigned to stress, unplanned downtime is disruptive to the overall production plant. In crisis mode, every minute becomes important and employees rush to identify the cause of the failure and to restore production. This takes a toll on morale and ultimately on productivity.
  • Innovation. When employees are assigned to the stressful task of restoring a failed production line, their ability to add value and come up with creative ideas is limited.
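The True Downtime Cost formula and its components can be sketched as a small calculation. This is only an illustration under stated assumptions: the class and field names are hypothetical, the worker count, labor rate, repair premium and excess capacity figures are invented for the example, and intangible costs default to zero because no dollar value can usually be assigned to them. Only the 10 units per hour at 10 dollars per unit comes from the text.

```python
from dataclasses import dataclass


@dataclass
class DowntimeEvent:
    """One unscheduled downtime event (all field names are illustrative)."""
    hours_down: float
    units_per_hour: float
    revenue_per_unit: float
    idle_workers: int
    labor_rate_per_hour: float
    repair_premium: float = 0.0        # extra cost of unscheduled repair crews and parts
    excess_capacity_cost: float = 0.0  # cost of extra capacity attributed to this event
    intangible_cost: float = 0.0       # stress, lost innovation; usually not quantifiable


def lost_revenue(e: DowntimeEvent) -> float:
    """Opportunity cost of the production that did not happen."""
    return e.hours_down * e.units_per_hour * e.revenue_per_unit


def labor_cost(e: DowntimeEvent) -> float:
    """Idle labor during the stoppage plus the repair cost premium."""
    return e.hours_down * e.idle_workers * e.labor_rate_per_hour + e.repair_premium


def true_downtime_cost(e: DowntimeEvent) -> float:
    """Tangible costs (revenue + labor + excess capacity) plus intangible costs."""
    tangible = lost_revenue(e) + labor_cost(e) + e.excess_capacity_cost
    return tangible + e.intangible_cost


# One hour down at 10 units/hour and $10/unit (from the text);
# the remaining figures are hypothetical.
event = DowntimeEvent(
    hours_down=1.0,
    units_per_hour=10,
    revenue_per_unit=10.0,
    idle_workers=5,
    labor_rate_per_hour=30.0,
    repair_premium=200.0,
    excess_capacity_cost=50.0,
)
print(f"Lost revenue:       ${lost_revenue(event):.2f}")
print(f"True downtime cost: ${true_downtime_cost(event):.2f}")
```

Keeping each component in its own function makes it easy to see which inputs a plant can actually supply and which it has to estimate.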

Practitioner’s feedback: How is downtime measured? Do industrial plants measure true downtime costs (including labor cost, excess capacity, and intangibles) or do they use revenue lost?

Steve Borris, Author

Cost is relatively straightforward.

I have often used it as a means to persuade companies to buy proper spares and turn-around parts … to say nothing of semi-skilled maintenance workshops.

However, production loss is only a component – if the lost production cannot be made by running at a different time and all orders still being met. Indeed, it is one of the few times I have found Utilization to be a helpful measure.

I have found when analyzing the reasons for downtime and the work done to repair, that logged reports are not as reliable as we might expect. I am basing that on a number of companies across the UK and Europe that have spent oodles of cash on “monitoring” systems.

For me, the best “data” is found by a “Zero Fails” type team doing regular analyses of downtimes – including, for example, repeat issues and PM’s that don’t do what they should… basically a proper TPM system.

OEE is probably the best KPI if used properly.

Michael Johnson, Maintenance Supervisor

There are two metrics that should be obvious. The first one is the loss of production and the second would be the loss of opportunity to produce.

Losing capacity for a period of time doesn’t necessarily mean your company lost money. Downtime costs will not show a negative on the books if there was additional equipment in place to counter the downed machine. Even with no or low costs associated with a particular downtime event, the data collected is a gold mine of opportunities to analyze the failure to the resolution.

Losing production, on the other hand, is expensive training. One would think it’s pretty easy to gather up a money value of the day’s losses by counting labor, waste, and loss of shipment to customers, right? Not always. One thing you will never really know is all the nickel-and-dime losses such as reputation and loss of future business.

The one way you can focus on both scenarios is measuring OEE, TAKT time, and speed loss. Don’t forget your rework and package waste! Together these metrics will give you a better understanding of your production line’s true losses and not some good guess.

Alfredo Manuel Láttero, Consultant Engineer

Lagging indicators AND leading indicators are both necessary in typical organizations. I agree with the idea of measuring losses as well as drilling down to search for the root causes as a reactive action. Because downtime only, in a variable flow process, can be misleading. Here is where OEE enters in the scene as a more complete indicator.

Where to measure OEE is a key issue: As a KPI, I advocate using it mainly in the bottlenecks.

And, yes, IIoT will be helpful, particularly in those situations where extended distances are involved.

But leading indicators (PM or PdM accomplishment – assuming plans were optimized and updated, % of reactive maintenance, etc.) are essential in preventing deterioration of the lagging indicators. In the root-cause finding process, extension of causes and conditions are also proactive actions even more effective than leading indicators.

Anthony Mallette, Millwright

Calculating gross (or fixed) uptime is straightforward. Start with the “perfect scenario”, that is, every moment the equipment is powered up, it is producing. Be realistic: 8 hours is actually 7 hours after breaks etc.

Then subtract your manufacturer’s recommended maintenance downtime and your cleaning time and you have an impossible ‘fixed’ number of uptime hours.

Hypothetically, 7×200 days – 25 hours maintaining and 200 hours cleaning. That’s 1,175 hours of production per year on 1 shift.

What can you do with that number? If your changeovers are 3 hours each and you do 1 every 5 days, then you lose 120 hrs. Now you have 1,055 hours, although Continuous Improvement strategies can trim that.

Take your year of actual uptime and compare to the gross/fixed and make a plan to get closer to it.
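The arithmetic in this hypothetical can be written out as a short sketch. The function names are illustrative, and the figures are the ones quoted above (7 productive hours a day, 200 days, 25 hours of maintenance, 200 hours of cleaning, one 3-hour changeover every 5 days).

```python
def fixed_uptime_hours(productive_hours_per_day: float, production_days: int,
                       maintenance_hours: float, cleaning_hours: float) -> float:
    """Gross ('fixed') uptime: perfect-scenario hours minus planned maintenance and cleaning."""
    return productive_hours_per_day * production_days - maintenance_hours - cleaning_hours


def changeover_hours(production_days: int, days_between_changeovers: int,
                     hours_per_changeover: float) -> float:
    """Hours lost to changeovers, assuming one changeover per full interval."""
    return (production_days // days_between_changeovers) * hours_per_changeover


fixed = fixed_uptime_hours(7, 200, maintenance_hours=25, cleaning_hours=200)
lost_to_changeovers = changeover_hours(200, days_between_changeovers=5, hours_per_changeover=3)
available = fixed - lost_to_changeovers

print(f"Fixed uptime:         {fixed:.0f} hours")
print(f"Lost to changeovers:  {lost_to_changeovers:.0f} hours")
print(f"Available production: {available:.0f} hours")
```

Comparing a year of actual uptime against `available` gives the gap that an improvement plan would aim to close.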

Robin Wavite, CMRP

I have been involved in measuring and conducting investigations into causes of Downtime and it’s specifically related to delayed production thus called delay accounting/revenue loss.

The availability metric (uptime) is usually centered around Tier 1 (production stopper) assets.

For example, in our industry every hour that one mill is down translates to a US$75,000 production delay.

Why is downtime not measured today?

There is a good reason that many companies are not measuring downtime. The formula for true downtime is simple and logical. However, applying the formula is almost impossible when the data is not available.

Let us put aside the intangible measurements such as stress and innovation. Although these are important considerations, from a practical perspective there is no way to assign dollar values and they are not calculated.

Although average labor costs can be calculated, the specific time wasted before and during machinery repair can only be measured if plant employees record whether time was wasted, or whether they were reassigned to other tasks. In the extreme of an entire facility shut down, it may be easier to calculate, but it also requires internal systems to log workers’ non-productive hours.

Similarly, repair cost premium can only be measured if maintenance employees record their time allocated to downtime.

The critical element of true downtime cost is lost revenue or opportunity cost. Here are some of the reasons that assigning a dollar value to a reduction in revenue is impractical:

  • Revenue is not tracked at a plant level. Production output is measured, but revenue occurs at a later stage in the value chain.
  • Calculating lost production is complex. Although time can be measured, figuring how many widgets were not produced during a specific time-period is very difficult. If the calculation is based on a measurement of the actual production lost during machine stoppage, this data would need to be recorded. Production plants simply do not track product-level data relating to lost production in a heterogeneous manufacturing environment.
  • The use of averages can result in miscalculations. In the extreme case, if an oil refinery is completely shut down, it is possible to calculate average processing rates and then multiply these rates by the downtime. When a plant produces a commodity item with a known market price (energy, chemicals etc.), lost revenue is relatively easy to calculate. However, in the case of a manufacturing facility with multiple product lines, one would need access to production-level manufacturing data and cannot rely on averages.
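A small sketch shows how averages can mislead in a plant with multiple product lines. All figures and names here are hypothetical: two lines with very different output rates and unit revenues, and a four-hour stoppage on the higher-value line.

```python
# Hypothetical plant: line name -> (units per hour, revenue per unit in dollars).
LINES = {
    "A": (100, 5.0),    # high volume, low value
    "B": (10, 200.0),   # low volume, high value
}


def hourly_revenue(line: str) -> float:
    """Revenue one line generates per hour of production."""
    units_per_hour, revenue_per_unit = LINES[line]
    return units_per_hour * revenue_per_unit


def average_based_loss(hours_down: float) -> float:
    """Estimate that uses the plant-wide average hourly revenue per line."""
    average = sum(hourly_revenue(name) for name in LINES) / len(LINES)
    return average * hours_down


def product_level_loss(line: str, hours_down: float) -> float:
    """Estimate that uses the hourly revenue of the line that actually stopped."""
    return hourly_revenue(line) * hours_down


estimate = average_based_loss(4)
actual = product_level_loss("B", 4)
print(f"Average-based estimate: ${estimate:,.0f}")
print(f"Product-level figure:   ${actual:,.0f}")
print(f"Underestimated by:      ${actual - estimate:,.0f}")
```

The same average would overstate the loss for a stoppage on line A, which is why product-level data is needed to get the figure right.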

Practitioner’s perspective: downtime versus process reliability

Mike Spence, Senior Reliability Engineer

OEE and/or Process Reliability are the high-level metrics that put downtime into perspective. The difference is that downtime/availability is an input to OEE, whereas it is an output from Process Reliability. I prefer Process Reliability – and the IIoT has a lot of potential to support Process Reliability analysis at lower levels in the functional hierarchy, rather than just at the production system level.
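To make the point that availability is an input to OEE concrete, here is a minimal sketch using the commonly cited definition OEE = Availability × Performance × Quality. The shift figures are hypothetical, and the sketch does not attempt to model Process Reliability.

```python
def availability(planned_minutes: float, downtime_minutes: float) -> float:
    """Share of planned production time the equipment was actually running."""
    return (planned_minutes - downtime_minutes) / planned_minutes


def performance(ideal_cycle_minutes: float, total_count: int, run_minutes: float) -> float:
    """Actual output relative to what the ideal cycle time would allow."""
    return (ideal_cycle_minutes * total_count) / run_minutes


def quality(good_count: int, total_count: int) -> float:
    """Share of produced units that are good."""
    return good_count / total_count


def oee(planned_minutes: float, downtime_minutes: float,
        ideal_cycle_minutes: float, total_count: int, good_count: int) -> float:
    """Overall Equipment Effectiveness as the product of its three factors."""
    run_minutes = planned_minutes - downtime_minutes
    return (availability(planned_minutes, downtime_minutes)
            * performance(ideal_cycle_minutes, total_count, run_minutes)
            * quality(good_count, total_count))


# Hypothetical shift: 400 planned minutes, 40 minutes down,
# ideal cycle of 1 minute per unit, 324 units made, 300 of them good.
shift_oee = oee(400, 40, 1.0, 324, 300)
print(f"Availability: {availability(400, 40):.1%}")
print(f"OEE:          {shift_oee:.1%}")
```

Downtime enters only through the availability factor, so two plants with identical downtime can still report very different OEE.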

Why survey data cannot be used to quantify the dollar cost of downtime

Due to a lack of accurate data, an alternative to using internal data for calculations is to use survey data for estimates. Let us look at a widely quoted study from the automotive industry. In 2006, 101 automobile executives were interviewed and asked to calculate the average cost of lost production. The average number was $22,000 per minute, but there were estimates as high as $50,000. Twelve years later, this figure is still in use even though it was an estimate made without a statistical basis.

Why can we not trust this $22,000 estimate? First, “downtime” is not defined, and there are multiple definitions of downtime. To what extent are intangibles included in the calculation? How is revenue calculated, and does the $22,000 take other profitability calculations into consideration? Does the $22,000 refer to the complete shutdown of an average-size plant or to a partial shutdown?

Without an understanding of the research methodology, we do not know the answer to these questions and it is likely that the $22,000 is inaccurate. When a production plant bases its estimates of true downtime cost on third-party research data that lacks rigorous standards, this results in misleading if not meaningless calculations.

Should uptime replace downtime?

Technically, uptime is the inverse of downtime. However, when executives shift their focus to uptime, it reflects a change in how they view maintenance.

Practitioner’s perspective

Eric Acosta, Grupo Wow

Focus on the downtime is a “negative” approach to increase efficiency. Sometimes it is better to challenge the team trying to increase the uptime, but of course, you still have to measure the downtime … We have sensors on each machine connected to a “real-time server”, so we can measure the uptime and the speed. This is linked to our ERP, so scheduling also updates in real time.

In the past, limiting downtime was seen as an exercise in cost reduction and viewed as an operational issue. It was not a corporate priority. Had reducing the dollars wasted on downtime been a C-level priority, better measurement systems and tools would be in place.

Industrial IoT is now a reality and companies recognize that within the terabytes of data generated from industrial plant sensors are hidden micropatterns that can warn us of emerging asset failure. For instance, in the case of SKF Enlight AI, we use Automated Machine Learning to detect abnormal sensor data. We can alert plant technicians that a piece of equipment is showing evidence of degradation so that it can be fixed prior to unscheduled breakdown. This way repairs can be scheduled, and parts ordered while a machine is still operational. Even if production loads need to be lowered temporarily, expensive machine or plant shutdown can be avoided.
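The general idea of flagging abnormal sensor data can be illustrated with a much simpler stand-in technique. The sketch below uses a rolling z-score, which is not the Automated Machine Learning approach mentioned above and does not describe how SKF Enlight AI works. The function name, window size, threshold and vibration readings are all hypothetical.

```python
from statistics import mean, stdev


def flag_anomalies(readings: list[float], window: int = 5, threshold: float = 3.0) -> list[int]:
    """Return indices of readings that deviate strongly from the preceding window.

    A reading is flagged when it lies more than `threshold` sample standard
    deviations from the mean of the previous `window` readings.
    """
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        mu = mean(baseline)
        sigma = stdev(baseline)
        if sigma > 0 and abs(readings[i] - mu) > threshold * sigma:
            flagged.append(i)
    return flagged


# Hypothetical vibration readings with one abnormal spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 0.95, 1.0, 2.5, 1.0, 1.02]
print("Abnormal reading indices:", flag_anomalies(vibration))
```

A flag like this would only be a prompt for a technician to inspect the asset; deciding whether it signals real degradation requires far richer models and domain knowledge.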

Practitioner’s perspective

Clive Moore, Managing Director, M2H Pty Ltd

I have heard from CFOs many times in my career that maintenance is a cost to their business. The alternative is those organisations that reverse this view and understand that good maintenance is ‘insurance’; as with all insurance, any loss has to be visible and accounted for. Leading companies follow the money. That is to say, they actually see failure of plant on the internal balance sheet. If you don’t account for the impact on earned value from plant losses, then companies are not incentivized to improve.

Summary and conclusion

A paper delivered at the First World Congress on Engineering Asset Management (WCEAM), held in Australia, explains that premiums for Boiler and Maintenance Insurance (BM) and Business Interruption Insurance (BI) can be affected by a company’s asset maintenance. The “insurance industry is vitally interested in technologies which reduce the risk interruption to plant and processes” as these impact risk profiling and cost of capital.

Over time we expect the focus to shift from downtime to uptime as C-level executives recognize the bottom-line potential of eliminating lost production. Since downtime has never been widely or properly measured, the dawn of Industrial IoT provides an opportunity to replace KPIs relating to lost dollars with those relating to revenue gained.

Special Note: We posted a question about Downtime on the Association of Asset Management Professionals – AMP LinkedIn Group and received much of the feedback in this article. We are grateful to all those people who provided insights and quotes that were used in writing this article.

SKF Enlight AI

Industrial plants generate terabytes of process data. SKF Enlight AI is a SaaS Predictive Maintenance solution that uses Automated Machine Learning to identify emerging asset failure patterns within this data. It provides early warnings and sensor-level intelligence to help avert unplanned downtime and meet production goals.
