Mean Time Between Failure (MTBF) is a key measurement of the average run time between failures, which helps to inform the reliability of an asset.

Our latest article on Reliability 101 covers MTBF: how to calculate it, how to improve it, and how to use the data to develop metrics for establishing KPIs.

Key Takeaways

  • Mean Time Between Failure (MTBF) measures the average operating time between failures of a piece of equipment or a component, which indicates how likely a failure is within a given time frame.
  • A high MTBF can mean fewer problems and lower costs for your equipment; a lower one could mean more frequent failures and more expenses.
  • MTBF can be improved through process improvements, data standardization, and by identifying the root cause of failures.
  • MTBF can be used to inform KPIs, including budgeting and capex investment decisions.
  • There are challenges in capturing a clean MTBF calculation, including how a “failure” is defined.

Reliability Metrics in context

Reliability metrics provide operations management with valuable information about the performance of various aspects of an operation. They are a means for comparing current site practices against industry standards, and they help to find areas where an organization can refine its processes to improve operational efficiency.

Having a complete picture of the reliability and potential failure modes of an asset is conducive to making decisions that contribute to your business strategy, capital investment projects, product development plans, and operational policies. 

Failure metrics are commonly used to measure asset reliability. They include Mean Time Between Failure (MTBF), Mean Time to Failure (MTTF), and Mean Time to Repair (MTTR).

This article provides an overview of MTBF and outlines how to calculate it, improve it, and use it to build KPIs.

What is Mean Time Between Failure (MTBF)?

Mean Time Between Failure (MTBF) measures the average operating time between failures of a piece of equipment or a component. It allows one to quantify reliability and to estimate how likely a failure is within a given period.

[Figure: Visualizing failure metrics, including MTTF, MTBF, and MTTR]

For many businesses, the goal is to maximize output and minimize downtime, so knowing your MTBF is useful in assessing the reliability of the systems that support your operations.

How to calculate MTBF 

The following formula is used to calculate MTBF: 

Total Operational Time / Total Number of Breakdowns = MTBF (hrs.)

Total operational time = total amount of time during which the equipment was actually running over the period being measured. Downtime for planned maintenance events and unplanned repairs is not counted as operational time.

Total number of breakdowns = total number of times the equipment has failed while running. This may include any type of failure, whether mechanical, electrical, software, or human error.
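As a minimal sketch, the formula above can be written as a small Python function. The function name and the pump figures are hypothetical and chosen only for illustration; they do not come from any real asset.

```python
def mtbf_hours(total_operational_hours: float, breakdowns: int) -> float:
    """Return MTBF in hours: total operational time / total number of breakdowns."""
    if breakdowns <= 0:
        # With no recorded breakdowns the ratio is undefined, so fail loudly.
        raise ValueError("MTBF is undefined when no breakdowns were recorded")
    return total_operational_hours / breakdowns


# Hypothetical example: a pump ran for 1,200 hours and broke down 4 times.
example_mtbf = mtbf_hours(1200, 4)
print(example_mtbf)  # 300.0
```

In this made-up case, the pump runs for 300 hours on average between breakdowns.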

What the results mean  

A high MTBF output means fewer problems with your equipment will occur over its lifetime. This translates into lower costs associated with repairs and downtime.  

A lower MTBF output means you are likely to experience more frequent failures. You should plan accordingly so that when one does happen, you can respond with the correct asset management strategy in place. 
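To make the difference between a high and a low MTBF more concrete, the sketch below converts an MTBF into the chance of running a given number of hours without a failure. It relies on an assumption this article does not make: that failures occur at a constant rate (the exponential reliability model), which gives R(t) = exp(-t / MTBF). Real assets that wear out over time may not follow this model, and the function name and figures are hypothetical.

```python
import math


def survival_probability(run_hours: float, mtbf_hours: float) -> float:
    """Probability of running `run_hours` without failure.

    Assumes a constant failure rate (exponential model): R(t) = exp(-t / MTBF).
    """
    if mtbf_hours <= 0:
        raise ValueError("MTBF must be positive")
    return math.exp(-run_hours / mtbf_hours)


# Hypothetical comparison: the same 100-hour production run on two assets.
high_mtbf = survival_probability(100, 1000)  # asset with an MTBF of 1,000 hours
low_mtbf = survival_probability(100, 200)    # asset with an MTBF of 200 hours
print(f"High-MTBF asset: {high_mtbf:.1%} chance of no failure")
print(f"Low-MTBF asset: {low_mtbf:.1%} chance of no failure")
```

Under this assumption, an asset run for a period equal to its MTBF has only about a 37% chance of getting through without a failure, so MTBF should not be read as a guaranteed failure-free time.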

How to Improve MTBF

There are many factors that affect your MTBF, such as asset age, operating conditions, and usage patterns. However, MTBF can be improved by reviewing the systems in place and making the appropriate changes.

Process Improvements

One method to improve MTBF is through process improvement. Processes like continuous monitoring, preventive maintenance, and regular testing can ensure that an asset remains reliable throughout its lifecycle.

Preventive Maintenance

Preventive maintenance can help to avoid costly downtime by reducing the risk of failure and increasing the reliability of an asset, thereby improving its MTBF.

Reliability Centered Maintenance, or RCM, is a reliability methodology for developing asset management strategies that help ensure equipment is available when needed. It includes examining how parts wear out over time, understanding why they fail, and developing preventive measures for each part type. RCM also gathers information about which assets may require more attention than others to develop a plan of action.

Data Standardization

Standardizing data provides reliable, accurate, and timely information on an asset’s performance to improve operational and strategic decision making. It is critical for an asset care strategy. Data standardization allows you to compare assets across multiple sites, locations, and time periods. This enables organizations to make more informed decisions about their overall asset portfolio.

Data standardization also helps improve the communication between different departments within an organization. If all employees use standardized data, they understand how their work relates to others’ efforts and they can easily see where gaps exist and collaborate to close them. 

Identify the cause of failures

When a machine does experience failure, a root cause analysis (RCA) can be useful for finding the source of the failure and developing solutions to prevent it from happening again. The Five Whys is one effective RCA method: by repeatedly asking why a problem occurred, it traces the problem back to its root cause so that a long-term solution can be developed.

How MTBF informs KPIs

Since MTBF is a measure of a system’s reliability, it can inform a variety of important business decisions and KPIs, which show how well your company is doing and where to improve future performance.


MTBF helps maintenance teams make informed decisions by giving a quantitative estimate of how often an asset is expected to fail. This helps with budgeting for a replacement or upgrade, as it helps determine when the asset will reach its end of life and need replacing.

Prioritizing Maintenance Activities

MTBF can help decide the right timeframe to schedule downtime for maintenance activities. Combined with a proactive maintenance approach, it can also be used to prioritize activities based on criticality. Further, it can provide a benchmark against which to measure progress.

For example, if the MTBF of an asset has improved due to optimized maintenance activities, it provides a measurable progress milestone. 
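One way to use MTBF as a benchmark is to compute it per period and compare each period with a baseline. The sketch below assumes hypothetical quarterly records of operating hours and failure counts; the figures and names are illustrative only.

```python
# Hypothetical quarterly records: (label, operating hours, number of failures)
quarters = [("Q1", 2000, 8), ("Q2", 2100, 6), ("Q3", 2050, 5)]


def mtbf_by_period(records):
    """Return {label: MTBF in hours}, skipping periods with no failures."""
    return {
        label: hours / failures
        for label, hours, failures in records
        if failures > 0
    }


trend = mtbf_by_period(quarters)
baseline = trend["Q1"]  # treat the first quarter as the benchmark

for label, value in trend.items():
    change_pct = (value - baseline) / baseline * 100
    print(f"{label}: MTBF {value:.0f} h ({change_pct:+.0f}% vs Q1)")
```

A rising MTBF from one period to the next is the kind of measurable milestone described above, provided failures are recorded the same way in every period.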

Capex Investment Decisions  

Capex investments are made based on expected future revenue streams. Knowing the expected failure behavior of a new asset helps to get a clearer picture of its total cost of ownership (TCO) and return on investment (ROI).

This knowledge helps to figure out whether the investment is worth it because there could be another option that provides greater ROI. Further, knowing the cost per hour of operation helps inform decisions regarding whether to replace aging equipment. 
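As a rough sketch of how MTBF can feed into such a comparison, the example below estimates the failure-related cost per operating hour and per year for an existing asset and a possible replacement. All figures are hypothetical, and the model is deliberately simplified: it assumes a fixed average cost per failure and ignores purchase price, energy, planned maintenance, and lost production.

```python
def failure_cost_per_operating_hour(mtbf_hours: float, cost_per_failure: float) -> float:
    """Average failure cost spread over each hour of operation."""
    return cost_per_failure / mtbf_hours


def expected_annual_failure_cost(
    mtbf_hours: float, cost_per_failure: float, annual_operating_hours: float
) -> float:
    """Expected failures per year (operating hours / MTBF) times cost per failure."""
    expected_failures = annual_operating_hours / mtbf_hours
    return expected_failures * cost_per_failure


# Hypothetical inputs: both assets run 6,000 hours a year, each failure costs 5,000.
aging_asset = expected_annual_failure_cost(400, 5000, 6000)
new_asset = expected_annual_failure_cost(1200, 5000, 6000)

print(f"Aging asset: {aging_asset:,.0f} per year in failure costs")
print(f"New asset:   {new_asset:,.0f} per year in failure costs")
print(f"Difference:  {aging_asset - new_asset:,.0f} per year")
```

The yearly difference is only one input to a TCO or ROI calculation; it would still need to be weighed against the cost of the new asset and any alternatives.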

Quality Assurance in Manufacturing  

MTBF is used as a measure of quality assurance in manufacturing processes.

For example, if a component fails after only three months, this may indicate inadequate quality control during manufacturing. It could mean that there was something wrong with the design process, materials were not up to spec, or some other problem occurred at the factory.

The same thing applies to components that have been operating well but suddenly start failing. 

Challenges in capturing MTBF

To find the true MTBF, you must account for variances that can affect the quality of the data and potentially compromise the validity of the information.

Variances in data collection

MTBF can vary depending on the equipment that is being measured. In addition, MTBF will change based on the environment in which the equipment operates.

Poor Data Tracking 

To measure the effectiveness of these strategies, we need reliable data on what goes wrong, so that we also know what is going right. This requires tracking all types of breakdowns, not just those caused by hardware issues such as broken parts or worn-out bearings.

It also includes incidents involving people making mistakes like forgetting to turn off machinery or not following safety procedures. 

These kinds of errors may seem minor, but they add up over time if you do not keep careful records. 

Incomplete records

If a company does not keep track of its history, then it cannot figure out whether a particular part has had any issues before. 

The current procedures in place may only keep track of major events such as repairs and replacements, which means that the actual number of incidents occurring over a period cannot be accurately determined.

Varying definitions of “Failure”

The definition of failure is open to interpretation. Some companies define failure as any incident that results in lost production, while others count anything less than 100% uptime as a failure.

In addition, many manufacturers choose to exclude certain categories of events from being considered part of the total number of failures because they believe that they do not affect the reliability of the product.

To calculate the true MTBF of an asset, you must include every type of event that affects its availability.
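The effect of the failure definition can be shown with a small sketch: the same event log produces different MTBF values depending on which events count as failures. The event categories, the log, and the 1,000 operating hours are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Event:
    category: str          # e.g. "mechanical", "electrical", "software", "operator_error"
    lost_production: bool  # whether the incident caused lost production


def mtbf_for_definition(operating_hours, events, counts_as_failure):
    """MTBF in hours under a given failure definition, or None if nothing qualifies."""
    failures = sum(1 for event in events if counts_as_failure(event))
    return operating_hours / failures if failures else None


# Hypothetical event log for one asset over 1,000 operating hours.
events = [
    Event("mechanical", True),
    Event("electrical", True),
    Event("software", False),
    Event("operator_error", False),
    Event("operator_error", True),
]

# Narrow definition: only incidents that caused lost production count.
narrow = mtbf_for_definition(1000, events, lambda e: e.lost_production)
# Broad definition: every recorded event counts as a failure.
broad = mtbf_for_definition(1000, events, lambda e: True)

print(f"Narrow definition: {narrow:.0f} h")
print(f"Broad definition:  {broad:.0f} h")
```

Because the narrow definition counts fewer events, it reports a higher MTBF for the same asset, which is why MTBF figures are only comparable when the definition of failure is agreed on and applied consistently.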

Going further than reliability metrics

The many advantages of knowing the MTBF of your assets also come with challenges. One common challenge is knowing whether the data that lives within your organization is clean enough to drive the right decisions for your bottom line.

At MaxGrip, we help asset-intensive organizations improve their maintenance and reliability practices by providing them with the blueprint to connect the dots between metrics and tangible business outcomes.

What does that produce?

More effective decision-making, more productivity, and, most of all, more profitability.

Learn more about our data-driven approach to boost your asset performance here.
