Preventive Maintenance is Vital to Data Center Health

03/01/2008

When correctly implemented, preventive maintenance programs ensure maximum reliability of data center equipment

As organizations become increasingly dependent on data center systems, the critical power system must become more reliable. For many organizations, the IT infrastructure has evolved into an interdependent, business-critical network that includes data, applications, storage, servers, and networking. A power failure at any point along the network can affect the entire operation and have serious consequences for the business.

To minimize unit-related failures, comprehensive preventive maintenance (PM) programs with OEM-trained and -certified technicians are recommended. When correctly implemented, PM programs ensure maximum reliability of data center equipment by providing systematic inspections that can detect and correct incipient failures, either before they occur or before they develop into major defects that can result in costly downtime. Typical PM programs include inspections, tests, measurements, adjustments, parts replacement, and housekeeping practices.

To quantify the value of regular, skilled preventive maintenance, in-depth analyses have been conducted; bottom-line-driven organizations can use this information to help shape their PM policies and practices.

The research was based on mean time between failures (MTBF), an industry-recognized measure of system reliability. In general, a higher MTBF number, stated in hours, indicates a more reliable unit. The analysis began by tabulating data covering more than 185 million operating hours for more than 5,000 three-phase UPS units. The sample was then broken into groups according to the number of PM visits written into each unit's service contract.
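As a rough illustration of the grouping step described above, the sketch below computes observed MTBF per group as total operating hours divided by total failures. The record layout, the function name, and every number in the sample are hypothetical and are not taken from the study.

```python
from collections import defaultdict

def mtbf_by_group(records):
    """Return observed MTBF (hours) per PM-visit group.

    records: iterable of (pm_visits_per_year, operating_hours, failures).
    This tuple layout is an assumption made for illustration only.
    """
    hours = defaultdict(float)
    failures = defaultdict(int)
    for pm_visits, op_hours, n_failures in records:
        hours[pm_visits] += op_hours
        failures[pm_visits] += n_failures
    # MTBF = total operating hours / total failures.
    # A group with no failures has no observed MTBF, so report None.
    return {
        group: (hours[group] / failures[group] if failures[group] else None)
        for group in hours
    }

# Hypothetical records, NOT data from the study.
sample = [
    (0, 50_000, 2),
    (0, 30_000, 2),
    (2, 90_000, 1),
    (2, 70_000, 1),
    (4, 60_000, 0),
]
result = mtbf_by_group(sample)
for group in sorted(result):
    print(group, result[group])
```

A group with zero recorded failures is reported as None rather than as an infinite MTBF, since a small sample with no failures says little about long-run reliability.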

An increase in the number of annual preventive maintenance visits increases the mean time between failures (MTBF).

The outcome of the model can be seen in the figure shown above, which depicts the expected MTBF figures projected up to six PM events per year. The MTBF estimate for the "no PM" group is substantially lower than the observed MTBF for units with emergency-service-only contracts, but is in line with the lifespan of components that must be replaced.

There is a substantial increase in MTBF from zero to six PM visits per year. When projected further out than six PM visits, the MTBF begins to level off at around 19 PM visits per year and then slowly declines at higher levels of maintenance. This decline after a large number of PM visits can be attributed to the fact that every service event introduces the possibility of service-related human error.

The study also found that UPS reliability increases when a factory-trained and -certified engineer services the equipment. Effective service engineers are trained on new procedures, equipment, designs, and changes to systems.

The number of preventive maintenance visits and the service engineer's level of training both have a substantial impact on system reliability. The research supports the need for at least two PM visits per year, and it also makes the case for more frequent PM visits at facilities where downtime is unacceptable. Depending on the cost of downtime for a particular application, a high return on investment can be realized in many cases by increasing PM frequency.
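The return-on-investment reasoning can be sketched with simple arithmetic. The example below assumes a constant failure rate, so that expected failures per year are roughly the hours in a year divided by MTBF. This is a simplification introduced here, not a method from the study, and the function names and all input values are hypothetical.

```python
HOURS_PER_YEAR = 8760

def expected_annual_failure_cost(mtbf_hours, cost_per_failure):
    """Expected yearly cost of failures for one unit.

    Assumes a constant failure rate, so expected failures per year
    are approximately HOURS_PER_YEAR / MTBF (a simplifying assumption).
    """
    return cost_per_failure * HOURS_PER_YEAR / mtbf_hours

def pm_roi(mtbf_before, mtbf_after, cost_per_failure, added_pm_cost):
    """Net savings per dollar spent on additional PM visits in one year."""
    savings = (
        expected_annual_failure_cost(mtbf_before, cost_per_failure)
        - expected_annual_failure_cost(mtbf_after, cost_per_failure)
    )
    return (savings - added_pm_cost) / added_pm_cost

# Hypothetical inputs, NOT figures from the study.
roi = pm_roi(
    mtbf_before=87_600,        # MTBF with the current PM schedule (hours)
    mtbf_after=175_200,        # MTBF with additional PM visits (hours)
    cost_per_failure=100_000,  # cost of one downtime event
    added_pm_cost=2_000,       # yearly cost of the extra visits
)
print(f"ROI: {roi:.2f}")
```

With these made-up inputs the extra visits return about 1.5 times their cost in avoided downtime. The result is dominated by the cost assigned to a single failure, which is why facilities with expensive downtime see the strongest case for more frequent PM.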

Jeff Powers is a senior product manager at Liebert Service, a part of Emerson Network Power, Columbus, OH.
