From: Franc Zabkar on
On Sun, 04 Oct 2009 17:10:07 -0700, Ant <ant(a)zimage.comANT> put finger
to keyboard and composed:

>Vendor Specific SMART Attributes with Thresholds:
>ID# ATTRIBUTE_NAME          FLAG   VALUE WORST THRESH TYPE     UPDATED WHEN_FAILED RAW_VALUE
>  5 Reallocated_Sector_Ct   0x0033 252   252   063    Pre-fail Always  -           9
>196 Reallocated_Event_Count 0x0008 251   251   000    Old_age  Offline -           2
>197 Current_Pending_Sector  0x0008 253   253   000    Old_age  Offline -           2
>198 Offline_Uncorrectable   0x0008 251   251   000    Old_age  Offline -           2
>201 Soft_Read_Error_Rate    0x000a 253   252   000    Old_age  Always  -           8

The above attributes indicate read problems.
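
If you want to pull those raw values out of a saved smartctl report automatically, something like the following Python sketch might do. It assumes each attribute sits on a single line (re-join wrapped lines first) and that the raw value is the tenth whitespace-separated field. The set of watched IDs is simply the five attributes quoted above, and the function names are made up for illustration.

```python
# IDs of the attributes quoted above; which ones to watch is an assumption.
WATCHED = {5, 196, 197, 198, 201}


def parse_attributes(text):
    """Map attribute ID -> (name, raw value as text) from smartctl -A style lines."""
    rows = {}
    for line in text.splitlines():
        # Drop Usenet quote markers and leading spaces, then split on whitespace.
        fields = line.lstrip("> ").split()
        if len(fields) < 10 or not fields[0].isdigit():
            continue  # header, blank, or wrapped line
        rows[int(fields[0])] = (fields[1], fields[9])
    return rows


def nonzero_watched(text):
    """Return {name: raw} for watched attributes whose raw value is above zero."""
    found = {}
    for attr_id, (name, raw) in parse_attributes(text).items():
        if attr_id in WATCHED and raw.isdigit() and int(raw) > 0:
            found[name] = int(raw)
    return found
```

Raw values that are not plain integers (such as the 865h+43m shown for Power_On_Minutes) are kept as text by parse_attributes and ignored by nonzero_watched.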

>SMART Error Log Version: 1

> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 01 83 a2 68 e2 Error: UNC 1 sectors at LBA = 0x0268a283 = 40411779

> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 06 67 a2 68 e2 Error: UNC 6 sectors at LBA = 0x0268a267 = 40411751

I'd identify the file(s) that occupy sector 40411779 and sectors
40411751 through 40411757. You can then delete each affected file and replace it
with a backup copy. This should free up those sectors/clusters
occupied by the file. The next time Windows tries to write to the bad
sectors, the drive will retest them and then most likely remap them.
If you don't have a backup, then you can try to recover as much of the
file as possible using Bad Block Copy (see below).
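
For anyone wondering where those sector numbers come from: the error log reports the 28-bit LBA spread across the SN, CL, CH and DH registers, with the low four bits of DH holding the top four bits of the address. A small Python sketch (the function name is mine) that reproduces the two numbers smartctl printed:

```python
def lba28_from_registers(sn, cl, ch, dh):
    """Combine the ATA task-file registers into a 28-bit LBA.

    SN holds bits 0-7, CL bits 8-15, CH bits 16-23,
    and the low nibble of DH holds bits 24-27.
    """
    return ((dh & 0x0F) << 24) | (ch << 16) | (cl << 8) | sn


# Register values taken from the two error log entries quoted above.
print(lba28_from_registers(0x83, 0xA2, 0x68, 0xE2))  # 40411779
print(lba28_from_registers(0x67, 0xA2, 0x68, 0xE2))  # 40411751
```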

The Win2K OEM support tools contain a utility named nfi.exe which
identifies the file that occupies a particular sector:

http://download.microsoft.com/download/win2000srv/utility/3.0/nt45/en-us/oem3sr2.zip

Nfi.exe also works on Win XP and 2003.

See these articles for more information:

http://sourceforge.net/mailarchive/message.php?msg_name=1MsZg8-0catai0%40fwd07.aul.t-online.de

http://groups.google.com/group/microsoft.public.win2000.file_system/browse_thread/thread/7cd6bbd5fade6590/

http://support.microsoft.com/kb/253066/en-us

The Linux instructions are here:

http://smartmontools.sourceforge.net/badblockhowto.html

Bad Block Copy is a Windows command-line tool that recovers data from
damaged media:

http://alter.org.ua/en/soft/win/bb_recover/

"Copies file ignoring Bad Blocks. If target file doesn't exist,
instead of unread blocks, ZEROs are written. If target file exists,
its blocks, corresponding to Bad Blocks in source are not touched.
Thus, if you have some copies of the same file with Bad Blocks in
different places, it is possible to completely restore the original
file. To do this you should run bbcopy.exe with same target, but
different sources."

You can also fill unreadable parts with '** BAD BLOCK ***'.
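
To make the quoted behaviour concrete, here is a rough Python sketch of the same idea. It is not how bbcopy.exe is implemented, only an illustration of the rule described in the quote. The 512-byte block size, the function names, and the read_block callback (which raises OSError for an unreadable block) are all assumptions of mine.

```python
import os

BLOCK = 512  # assumed block size; a real tool would match the media's sector size


def copy_skipping_bad(read_block, n_blocks, target_path, block=BLOCK):
    """Copy n_blocks blocks into target_path, tolerating unreadable blocks.

    read_block(i) must return the bytes of block i or raise OSError.
    New target: unreadable blocks are written as zeros.
    Existing target: blocks that cannot be read are left untouched.
    Returns the list of block indexes that could not be read.
    """
    existed = os.path.exists(target_path)
    bad = []
    # "r+b" keeps an existing target's contents; "w+b" creates a new file.
    with open(target_path, "r+b" if existed else "w+b") as out:
        for i in range(n_blocks):
            out.seek(i * block)
            try:
                data = read_block(i)
            except OSError:
                bad.append(i)
                if not existed:
                    out.write(b"\x00" * block)
                continue
            out.write(data)
    return bad


def flaky_reader(data, bad_blocks, block=BLOCK):
    """Build a read_block callback that simulates a damaged copy of data."""
    def read_block(i):
        if i in bad_blocks:
            raise OSError("simulated unreadable block %d" % i)
        return data[i * block:(i + 1) * block]
    return read_block
```

Running copy_skipping_bad twice against the same target, once from a copy with block 1 unreadable and once from a copy with block 2 unreadable, leaves a complete file, which is the "same target, different sources" trick from the quote.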

- Franc Zabkar
--
Please remove one 'i' from my address when replying by email.
From: Rod Speed on
Ant wrote
> Rod Speed wrote

>>> SMART results look bad for those errors.

>> Particularly the reallocated sectors.

>>> SMART said passed when I did a quick check.

>> I ignore that stuff and concentrate on the raw values.

> Hence, the quickie. ;)

That was a comment on your 'said passed'.

>>> I wonder if that is why Windows was feeling sluggish lately.

>> Likely retrying on the pending sectors.

> Does that mean a dying HDD

The number of reallocated and pending is getting up quite high.

> and needs to replace right away?

I would, since it's a Maxtor. They can go pear-shaped
very quickly once they start to die.

> Here's the updated SMART results after I did two chkdsks and a smartctl full test request:

> smartctl version 5.38 [i686-mingw32-2000-sp4] Copyright (C) 2002-8
> Bruce Allen
> Home page is http://smartmontools.sourceforge.net/
>
> === START OF INFORMATION SECTION ===
> Model Family: Maxtor DiamondMax Plus 8 family
> Device Model: Maxtor 6E040L0
> Serial Number: E155KPHE
> Firmware Version: NAR61590
> User Capacity: 41,110,142,976 bytes
> Device is: In smartctl database [for details use: -P show]
> ATA Version is: 7
> ATA Standard is: ATA/ATAPI-7 T13 1532D revision 0
> Local Time is: Sun Oct 04 16:43:43 2009 PDT
> SMART support is: Available - device has SMART capability.
> Enabled status cached by OS, trying SMART RETURN
> STATUS cmd.
> SMART support is: Enabled
>
> === START OF READ SMART DATA SECTION ===
> SMART overall-health self-assessment test result: PASSED
>
> General SMART Values:
> Offline data collection status: (0x82) Offline data collection
> activity was completed without error.
> Auto Offline Data Collection: Enabled.
> Self-test execution status: ( 116) The previous self-test
> completed having
> the read element of the test failed.
> Total time to complete Offline
> data collection: (1021) seconds.
> Offline data collection
> capabilities: (0x5b) SMART execute Offline immediate.
> Auto Offline data collection on/off support.
> Suspend Offline collection upon new
> command.
> Offline surface scan supported.
> Self-test supported.
> No Conveyance Self-test supported.
> Selective Self-test supported.
> SMART capabilities: (0x0003) Saves SMART data before
> entering power-saving mode.
> Supports SMART auto save timer.
> Error logging capability: (0x01) Error logging supported.
> No General Purpose Logging support.
> Short self-test routine
> recommended polling time: ( 2) minutes.
> Extended self-test routine
> recommended polling time: ( 17) minutes.
>
> SMART Attributes Data Structure revision number: 16
> Vendor Specific SMART Attributes with Thresholds:
> ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE
> UPDATED WHEN_FAILED RAW_VALUE
> 3 Spin_Up_Time 0x0027 227 221 063 Pre-fail
> Always - 5662
> 4 Start_Stop_Count 0x0032 253 253 000 Old_age
> Always - 1252
> 5 Reallocated_Sector_Ct 0x0033 252 252 063 Pre-fail
> Always - 9

That hasn't changed and is the most important one.

> 6 Read_Channel_Margin 0x0001 253 253 100 Pre-fail
> Offline - 0
> 7 Seek_Error_Rate 0x000a 253 252 000 Old_age
> Always - 0
> 8 Seek_Time_Performance 0x0027 246 240 187 Pre-fail
> Always - 41311
> 9 Power_On_Minutes 0x0032 242 242 000 Old_age
> Always - 865h+43m
> 10 Spin_Retry_Count 0x002b 253 252 157 Pre-fail
> Always - 0
> 11 Calibration_Retry_Count 0x002b 253 252 223 Pre-fail
> Always - 0
> 12 Power_Cycle_Count 0x0032 250 250 000 Old_age
> Always - 1251
> 192 Power-Off_Retract_Count 0x0032 252 252 000 Old_age Always - 1248
> 193 Load_Cycle_Count 0x0032 253 253 000 Old_age Always - 2746
> 194 Temperature_Celsius 0x0032 253 253 000 Old_age Always - 26

That's quite acceptable; Maxtors don't like running hot.

> 195 Hardware_ECC_Recovered 0x000a 253 252 000 Old_age Always - 8300
> 196 Reallocated_Event_Count 0x0008 251 251 000 Old_age
> Offline - 2
> 197 Current_Pending_Sector 0x0008 253 253 000 Old_age
> Offline - 2

That hasn't changed either.

> 198 Offline_Uncorrectable 0x0008 251 251 000 Old_age
> Offline - 2

That one has gained an extra one.

> 199 UDMA_CRC_Error_Count 0x0008 199 199 000 Old_age
> Offline - 0
> 200 Multi_Zone_Error_Rate 0x000a 253 252 000 Old_age Always - 0
> 201 Soft_Read_Error_Rate 0x000a 253 252 000 Old_age Always - 8
> 202 TA_Increase_Count 0x000a 253 251 000 Old_age Always - 0
> 203 Run_Out_Cancel 0x000b 253 252 180 Pre-fail Always - 29
> 204 Shock_Count_Write_Opern 0x000a 253 251 000 Old_age Always - 0
> 205 Shock_Rate_Write_Opern 0x000a 253 252 000 Old_age Always - 0
> 207 Spin_High_Current 0x002a 253 252 000 Old_age Always - 0
> 208 Spin_Buzz 0x002a 253 252 000 Old_age Always - 0
> 209 Offline_Seek_Performnce 0x0024 188 187 000 Old_age
> Offline - 0
> 99 Unknown_Attribute 0x0004 253 253 000 Old_age
> Offline - 0
> 100 Unknown_Attribute 0x0004 253 253 000 Old_age
> Offline - 0
> 101 Unknown_Attribute 0x0004 253 253 000 Old_age
> Offline - 0

> SMART Error Log Version: 1
> Warning: ATA error count 19 inconsistent with error log pointer 5
>
> ATA Error Count: 19 (device log contains only the most recent five
> errors) CR = Command Register [HEX]
> FR = Features Register [HEX]
> SC = Sector Count Register [HEX]
> SN = Sector Number Register [HEX]
> CL = Cylinder Low Register [HEX]
> CH = Cylinder High Register [HEX]
> DH = Device/Head Register [HEX]
> DC = Device Command Register [HEX]
> ER = Error register [HEX]
> ST = Status register [HEX]
> Powered_Up_Time is measured from power on, and printed as
> DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
> SS=sec, and sss=millisec. It "wraps" after 49.710 days.
>
> Error 19 occurred at disk power-on lifetime: 3881 hours (161 days + 17
> hours)
> When the command that caused the error occurred, the device was in
> an unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 01 83 a2 68 e2 Error: UNC 1 sectors at LBA = 0x0268a283 =
> 40411779
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> c8 00 01 83 a2 68 e2 00 02:14:10.592 READ DMA
> c8 00 01 82 a2 68 e2 00 02:14:10.592 READ DMA
> c8 00 01 81 a2 68 e2 00 02:14:10.592 READ DMA
> c8 00 01 80 a2 68 e2 00 02:14:10.592 READ DMA
> c8 00 01 7f a2 68 e2 00 02:14:10.592 READ DMA
>
> Error 18 occurred at disk power-on lifetime: 3881 hours (161 days + 17
> hours)
> When the command that caused the error occurred, the device was in
> an unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 06 67 a2 68 e2 Error: UNC 6 sectors at LBA = 0x0268a267 =
> 40411751
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> c8 00 22 67 a2 68 e2 00 02:14:09.264 READ DMA
> c8 00 1e 13 ee 30 e1 00 02:14:09.248 READ DMA
> c8 00 1f f1 15 30 e1 00 02:14:09.248 READ DMA
> c8 00 1e 2f a8 27 e1 00 02:14:09.232 READ DMA
> c8 00 1f da e7 23 e1 00 02:14:09.232 READ DMA
>
> Error 17 occurred at disk power-on lifetime: 3871 hours (161 days + 7
> hours) When the command that caused the error occurred, the device
> was in an unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 06 67 a2 68 e2 Error: UNC 6 sectors at LBA = 0x0268a267 =
> 40411751
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> c8 00 22 67 a2 68 e2 00 04:55:32.656 READ DMA
> c8 00 80 81 02 ab e1 00 04:55:32.656 READ DMA
> c8 00 08 09 02 ab e1 00 04:55:32.640 READ DMA
> c8 00 1e 13 ee 30 e1 00 04:55:32.640 READ DMA
> c8 00 08 c5 a3 11 e1 00 04:55:32.624 READ DMA
>
> Error 16 occurred at disk power-on lifetime: 1858 hours (77 days + 10
> hours) When the command that caused the error occurred, the device
> was in an unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 62 db a5 68 e2 Error: UNC 98 sectors at LBA = 0x0268a5db =
> 40412635
>
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> c8 00 67 db a5 68 e2 00 00:49:20.528 READ DMA
> c8 00 08 89 ae 65 e0 00 00:49:20.512 READ DMA
> c8 00 0f 42 ae 65 e0 00 00:49:20.512 READ DMA
> c8 00 02 d6 d4 75 e0 00 00:49:20.512 READ DMA
> c8 00 03 f5 78 69 e2 00 00:49:20.512 READ DMA
>
> Error 15 occurred at disk power-on lifetime: 1455 hours (60 days + 15
> hours) When the command that caused the error occurred, the device
> was in an unknown state.
>
> After command completion occurred, registers were:
> ER ST SC SN CL CH DH
> -- -- -- -- -- -- --
> 40 51 05 a8 fb 13 e0 Error: UNC 5 sectors at LBA = 0x0013fba8 =
> 1309608
> Commands leading to the command that caused the error were:
> CR FR SC SN CL CH DH DC Powered_Up_Time Command/Feature_Name
> -- -- -- -- -- -- -- -- ---------------- --------------------
> c8 00 08 a8 fb 13 e0 00 01:04:55.104 READ DMA
> ca 00 08 3c f4 64 e2 00 01:04:55.104 WRITE DMA
> c8 00 80 d0 fb 13 e0 00 01:04:55.104 READ DMA
> c8 00 08 a8 fb 13 e0 00 01:04:53.968 READ DMA
> ca 00 80 03 de 72 e2 00 01:04:53.968 WRITE DMA
>
> SMART Self-test log structure revision number 1
> Num Test_Description Status Remaining
> LifeTime(hours) LBA_of_first_error
> # 1 Extended offline Completed: read failure 40% 3882
> 49793
> # 2 Extended offline Interrupted (host reset) 70% 3881
> -
> # 3 Short offline Completed without error 00% 0
> -
>
> SMART Selective self-test log data structure revision number 1
> SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
> 1 0 0 Not_testing
> 2 0 0 Not_testing
> 3 0 0 Not_testing
> 4 0 0 Not_testing
> 5 0 0 Not_testing
> Selective self-test flags (0x0):

> After scanning selected spans, do NOT read-scan remainder of disk.
> If Selective self-test is pending on power-up, resume after 0 minute delay.


> I am currently backing up my client's important datas right now to my USB external HDD since it appears to be dying

Yes, that's what you should be doing.

> from what the results and you're saying (not surprised from an old PC and HDD). :(

Whoops, just noticed it's a Maxtor. You into necrophilia?


From: Arno on
Ant <ant(a)zimage.comant> wrote:
> Hello!

> Am I understanding correctly that these terms are basically the same?

They are not. Sectors are the storage units of the
underlying storage device, while clusters are the Microsoft
filesystem's allocation blocks. A cluster is typically
a multiple of the sector size (512 bytes for HDDs and 2048
bytes for some optical media).

> If
> so, then why did my client's updated Windows 2000 SP4's chkdsk (/r /f
> parameters and rebooted to run it) on a HDD (NTFS) in an old Dell
> Optiplex system say there was a bad cluster and was able to move a file
> to a better place, but I rerun chkdsk in Windows 2000 and ran a chkdsk
> (no parameters) and it found 0 KB of bad sector?

Well, a bad cluster is usually a cluster marked bad in the
filesystem. This is a leftover artefact from the time when
HDDs did expose their bad areas to the OS. Typically a
cluster gets marked bad if one or more sectors in it
experience a read error.

A HDD only marks a sector as bad when it cannot read that
sector even with extended effort. However, it will still try to read
it on a new request from the OS, and it will not export that
marking to the OS, except in the SMART self-test log, and there
only for the first one found. The marking by the HDD exists
to allow it to reallocate the sector (replace it with a good one)
on a write.

So sectors belong to the hardware and the low-level drivers,
while clusters belong to the (abstract) filesystem layer.
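
As a rough illustration of how the two layers relate on NTFS, a cluster number can be derived from a disk sector number with simple arithmetic. This is only a sketch: the partition start of sector 63 and the 8 sectors per cluster (4 KB clusters) in the example are assumed values, not ones read from this disk, and the function name is made up.

```python
def lba_to_cluster(lba, partition_start_lba, sectors_per_cluster):
    """Return (cluster number, sector offset inside that cluster) for a disk LBA.

    Assumes NTFS-style numbering, where cluster 0 begins at the first
    sector of the partition.
    """
    if lba < partition_start_lba:
        raise ValueError("LBA lies before the start of the partition")
    relative = lba - partition_start_lba
    return divmod(relative, sectors_per_cluster)


# Hypothetical geometry: partition starts at sector 63, 8 sectors per cluster.
print(lba_to_cluster(40411779, 63, 8))  # (5051464, 4)
```

With those assumed numbers, one unreadable 512-byte sector is enough for chkdsk to flag the whole 4 KB cluster that contains it.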

Arno
From: Ant on
On 10/4/2009 8:08 PM PT, Rod Speed typed:

>> I am currently backing up my client's important datas right now to my USB external HDD since it appears to be dying
>
> Yes, thats what you should be doing.
>
>> from what the results and you're saying (not surprised from an old PC and HDD). :(
>
> Whoops, just noticed its a Maxtor. You into necrophilia ?

I didn't buy and install the HDD. Someone else did, and it's old since it
came with Windows 2000 SP4, Novell, Epocrates, an internal zip drive, a
CD-RW burner drive, etc. Sheesh! I am just diagnosing the issues (good
thing Windows 2000 SP4's event log said something about the HDD) and
backing up for my client. ;)
--
"He who storms in like a whirlwind returns like an ant." --Borneo
/\___/\
/ /\ /\ \ Phil/Ant @ http://antfarm.ma.cx (Personal Web Site)
| |o o| | Ant's Quality Foraged Links (AQFL): http://aqfl.net
\ _ / Nuke ANT from e-mail address: philpi(a)earthlink.netANT
( ) or ANTant(a)zimage.com
Ant is currently not listening to any songs on his home computer.
From: Ant on
On 10/4/2009 8:58 PM PT, Arno typed:

[snipped]

> So sectors is on the hardware and the low-level drivers,
> while clusters is in the (abstract) filesystem layer.

Thanks! That makes more sense to me now. I did notice that the updated
Windows 2000 SP4's chkdsk /r /f said it found a bad cluster (I think it
was in a TIF file, but I can't remember and didn't watch the long chkdsk; I
wish it had a logging feature or paused to show me the results before
going back to Windows) and fixed it (moved the file). Then I ran a normal
chkdsk.exe from cmd.exe in a Windows 2000 session, but it didn't show any bad
sectors (0 KB). That is what confused me.
--
"What do ants and bees use for cattle?" --Tom