From: Folkert Rienstra on 17 Apr 2008 18:54

Franc Zabkar wrote in news:gt9b045tqpk3gbj6i28lrqprncec62bge5(a)4ax.com
> I've been reading this document which is an analysis of Google's hard
> disc failure rates:
>
> Failure Trends in a Large Disk Drive Population:
> http://research.google.com/archive/disk_failures.pdf
>
> It states that "contrary to previously reported results, we found very
> little correlation between failure rates and either elevated
> temperature or activity levels."
>
> Figure 4 "shows that failures do not increase when the average
> temperature increases. In fact, there is a clear trend showing that
> lower temperatures are associated with higher failure rates. Only at
> very high temperatures is there a slight reversal of this trend."
>
> "Figure 5 looks at the average temperatures for different age groups.
> The distributions are in sync with Figure 4 showing a mostly flat
> failure rate at mid-range temperatures and a modest increase at the
> low end of the temperature distribution.
> What stands out are the 3 and 4-year old drives, where the trend for
> higher failures with higher temperature is
> much more constant

Presumably they mean the bathtub figures look like copies of each other.

> and also more pronounced."

What I find much more interesting is the trend reversal from 3rd to 4th year,
while maintaining equal relation between AFR and temperature ranges.
Presumably the weaker brothers fall out of the mix and the rest just lives on happily.

>
> "Overall our experiments can confirm previously reported temperature
> effects only for the high end of our temperature range and especially
> for older drives. In the lower and middle temperature ranges, higher
> temperatures are not associated with higher failure rates."
>
> Figure 5 suggests that Google's optimum temperature for hard drives is
> between 35C and 40C.
>
> Elsewhere I found this old IBM article:
> http://web.archive.org/web/20000519230551/http://www.storage.ibm.com/hardsoft/diskdrdl/technolo/drivetemp/drivetemp.htm
>
> It states that "figure 2 shows the dramatic effect that temperature
> has on the overall
> *reliability*
> of a hard disk drive. Derivations [sic] from a nominal operating
> temperature (assumed to be maintained over the life of a drive)
> can result in a derivation [sic] from the nominal
> failure *rate*.

Hey, there is that favourite word of yours again.

> As the temperature exceeds the recommended level, the
> failure rate increases two to three percent for every one degree rise
> above it. For example, a hard disk drive running for an extended
> period of time at five degrees above the recommended temperature can
> experience an increase in
> failure *rate*

And again.

> of 10 to 15 percent.
> Likewise, operating a drive below the recommended temperature can extend
> drive life."
>
> This last statement is a bit ambiguous. If a hard drive is more reliable
> at a temperature below that which is recommended, then why not
> recommend a lower temperature in the first place? Then again, maybe
> the author's intended meaning was "recommended maximum temperature".
>
> - Franc Zabkar
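
The IBM rule of thumb quoted above (a two to three percent rise in failure rate per degree above the recommended temperature) can be checked with a few lines of arithmetic. This is only a sketch: the function name is hypothetical, and it assumes the increase scales linearly with the number of degrees, which is the reading that reproduces the article's own "10 to 15 percent" example. The quoted text gives no figure for running below the recommended temperature, so that case is deliberately left undefined.

```python
def failure_rate_increase_pct(degrees_above, pct_per_degree):
    """Percent increase in failure rate, reading the quoted rule linearly.

    Assumption: the increase is proportional to the number of degrees
    above the recommended temperature, as in the article's example.
    """
    if degrees_above < 0:
        # The quoted text states no number for operation below the
        # recommended temperature, so this sketch refuses to guess.
        raise ValueError("rule is only stated for temperatures above the recommended level")
    return degrees_above * pct_per_degree


# The article's example: five degrees above the recommended temperature.
low = failure_rate_increase_pct(5, 2.0)
high = failure_rate_increase_pct(5, 3.0)
print(f"{low:.0f} to {high:.0f} percent")  # prints "10 to 15 percent"
```

Note that these are relative increases in the failure rate, not percentage points added to it.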
From: lars on 20 Apr 2008 08:16

In short, time well spent reading.
http://www.pdl.cmu.edu/PDL-FTP/Failure/failure-fast07_abs.html
From: Franc Zabkar on 20 Apr 2008 17:50

On Sun, 20 Apr 2008 14:16:28 +0200, lars <lars(a)hesdorf.dk> put finger
to keyboard and composed:

>In short, time well spent reading.
>http://www.pdl.cmu.edu/PDL-FTP/Failure/failure-fast07_abs.html

This document appears to be a statistical analysis of HD failures. It
doesn't attempt to delve into the technical reasons for failure. The
only time it discusses temperature, or SMART, is in reference to the
Google article in my OP.

Google's experience suggests to me that temperatures below about 35C
result in greater failure rates, which is contrary to normal
expectations. However, Arno appears to be saying that the lower temps
may be a consequence of failure rather than a cause.

- Franc Zabkar
--
Please remove one 'i' from my address when replying by email.
From: Arno Wagner on 20 Apr 2008 18:03

Previously Franc Zabkar <fzabkar(a)iinternode.on.net> wrote:
> On Sun, 20 Apr 2008 14:16:28 +0200, lars <lars(a)hesdorf.dk> put finger
> to keyboard and composed:

>>In short, time well spent reading.
>>http://www.pdl.cmu.edu/PDL-FTP/Failure/failure-fast07_abs.html

> This document appears to be a statistical analysis of HD failures. It
> doesn't attempt to delve into the technical reasons for failure. The
> only time it discusses temperature, or SMART, is in reference to the
> Google article in my OP.

> Google's experience suggests to me that temperatures below about 35C
> result in greater failure rates, which is contrary to normal
> expectations. However, Arno appears to be saying that the lower temps
> may be a consequence of failure rather than a cause.

Exactly. It is possible, but the paper does not give us enough
data to determine whether it is the case. Also, it runs contrary
to all known reliability characteristics of semiconductors,
other electronic components and mechanics.

Arno
From: Franc Zabkar on 21 Apr 2008 01:09

On 20 Apr 2008 22:03:24 GMT, Arno Wagner <me(a)privacy.net> put finger to
keyboard and composed:

>Previously Franc Zabkar <fzabkar(a)iinternode.on.net> wrote:
>> On Sun, 20 Apr 2008 14:16:28 +0200, lars <lars(a)hesdorf.dk> put finger
>> to keyboard and composed:
>
>>>In short, time well spent reading.
>>>http://www.pdl.cmu.edu/PDL-FTP/Failure/failure-fast07_abs.html
>
>> This document appears to be a statistical analysis of HD failures. It
>> doesn't attempt to delve into the technical reasons for failure. The
>> only time it discusses temperature, or SMART, is in reference to the
>> Google article in my OP.
>
>> Google's experience suggests to me that temperatures below about 35C
>> result in greater failure rates, which is contrary to normal
>> expectations. However, Arno appears to be saying that the lower temps
>> may be a consequence of failure rather than a cause.
>
>Exactly. It is possible, but the paper does not give us enough
>data to determine whether it is the case. Also, it runs contrary
>to all known reliability characteristics of semiconductors,
>other electronic components and mechanics.
>
>Arno

What about fluid dynamics? Maybe there is an optimal temperature for
the platter lubricant and/or air bearing.

I found this interesting Samsung patent whose inventors claim that
"flying height drops significantly in humid conditions" and that this
can be remedied "by increasing the temperature of the air flowing
between a slider's air bearing surface and the rotating disk surface
it accesses".

Method and Apparatus Reducing Flying Height Drop in a Hard Disk Drive
Under Humid Conditions:
http://tinyurl.com/4s5brl
http://www.freshpatents.com/Method-and-apparatus-reducing-flying-height-drop-in-a-hard-disk-drive-under-humid-conditions-dt20071227ptan20070297085.php

- Franc Zabkar
--
Please remove one 'i' from my address when replying by email.