Stories from the Field…

People are always asking us for data and graphs showing examples of a catastrophic failure or load loss caused by defective batteries in their UPS systems. These requests are very difficult for us to support because, unless someone is ignoring their Cellwatch system alarms, there simply are not any good examples to share. Our primary design focus is to ensure that when the Cellwatch system is installed and operators are properly monitoring their batteries, catastrophic battery failures can be avoided.

However, we did eventually find a situation that produced very interesting battery data for everyone involved. It all began when we received a call from an installer explaining that a battery had failed over a weekend. This rarely occurs, so we requested the Cellwatch data.

The following background helps illustrate the situation. The site had three small UPSs, each with a 2-string, 16-jar battery. On the Friday before the disaster, the float voltage on two of the UPSs increased by about 7 volts. Most battery technicians dealing with high voltage batteries would consider an increase of 7 volts negligible, but across sixteen 12 volt jars that’s almost half a volt of extra float voltage on each jar. This clearly called for immediate attention. Our suggested best practice is to investigate any alarm condition, and since this situation created an alarm, the cause should have been investigated soon after the alarm was reported.
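The per-jar figure quoted above is simple division, and a rough sketch of the arithmetic is shown here. The function name is ours, for illustration only, and the calculation assumes the extra voltage is shared evenly across the jars in series. As the jar 5 data in this story shows, a weakening jar can end up taking far more than its even share.

```python
def per_jar_increase(string_delta_volts: float, jars_in_series: int) -> float:
    """Spread a change in string float voltage evenly across the series jars.

    Illustrative helper only; assumes every jar takes an equal share.
    """
    if jars_in_series <= 0:
        raise ValueError("jars_in_series must be positive")
    return string_delta_volts / jars_in_series


# Figures from the story: about 7 volts extra across a 16-jar string.
increase = per_jar_increase(7.0, 16)
print(f"{increase:.4f} V per jar")  # 0.4375 V per jar
```

In other words, a change that looks small at the string level is close to half a volt on every 12 volt jar.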

A quick look at the Cellwatch data showed the charge current rose to almost 13 amps, one of the jar float voltages rose to almost 16 volts over the weekend, and the temperature in the battery cabinet rose to 140 degrees Fahrenheit (60 degrees Celsius). Ultimately, the 12 volt jar failed without exploding or causing a fire, but the whole string was irreparably damaged and had to be replaced.

The Voltage graph clearly shows UPS 2, String 1, Jar 5 rising and accepting some of the over voltage beginning Friday, 4/11.

We’re sure many of you know what caused this: the battery had started into thermal runaway. Taking readings every day, Cellwatch detected the problem long before a fire or full runaway could occur. In fact, Cellwatch is used by companies required to meet International Fire Code section 608.3. Cellwatch is a listed device and can be configured not only to detect but also to prevent thermal runaway conditions.

The detailed history of the events revealed that the battery technician who was called out to the site assumed it was a battery problem and had planned to just change all 16 jars in the failed string of the UPS when he returned to work the following week. However, after NDSL’s battery expert looked at the data and saw that both UPS 2 and UPS 3 float voltage had increased on Friday, we suggested there was a more serious issue and he should try to find out why.

Now, the interesting aspect of the story is that the annual UPS maintenance visit had taken place on the Friday when the battery voltages began to rise, and the UPS technician had turned up the float voltage because the front panel LCD meters on two of the three UPSs indicated dropping voltages. Both meters were reading 7 volts low according to a newly calibrated handheld DVM measuring across the battery breaker.

Thinking the charger was simply adjusted too low, the UPS technician had ‘bumped up’ the float, which could lead to catastrophic consequences. He had in fact exacerbated the problem by making the wrong assumption about the cause of the low battery voltage. It’s worth noting that the float voltage increase was instantly visible on the Cellwatch battery monitoring system, which was fitted to all three UPS systems, and the one jar that started into thermal runaway, jar 5, triggered the alarm as early as Saturday. Action should have been taken at that time. Interestingly, it took the entire weekend for the current to reach its peak, and by Monday, when UPS 2 was pumping almost 13 amps into this battery string, the temperature had also reached its peak.

This is one of the rare events when the voltage will sound the alarm before ohmic value as the high voltage (and temperature) is actually destroying the charge holding capability of the battery. Had the charger voltage not been increased, it is highly likely that Cellwatch would have detected the failing jar long before it would have begun to heat up in the overcharged condition. However, this weakening cell was put under so much stress with the increased charger voltage that it accelerated the performance degradation.

(Note: The falling Ohmic Value graph above is a result of some of the channels of that DCM going so high they are off the scale of acceptable performance readings and are clipped by Cellwatch.)

Key takeaways from the story:

  • Discourage UPS maintenance on Fridays unless someone will be on-site to address any issues the following day.
  • Have the UPS technician check the calibration of the front panel instruments from time to time.
  • Make sure Cellwatch is installed and someone is looking at it – especially immediately following planned maintenance visits.

Alarms must have been going off all weekend with the temperature alarm being triggered sometime on Saturday and reaching its peak Sunday night/Monday morning.

It’s interesting to note how closely the current and temperature characteristics track each other, as can be seen from the following charge current graph. Cellwatch does not trigger an alarm on high charge current, but it is gratifying to note that all other alarms would already have been triggered by their own parameters before the current became high.