From: jack smith on
AIUI ... a bad sector on an IDE hard drive gets labelled as unusable
in the drive's map and a substitute sector is found. The user
wouldn't know about it because this happens transparently.

If there are LOTS of bad sectors, couldn't we have a situation
where the drive's performance is poor but running drive diagnostics
gives no indication that anything is wrong?

I have some hard drives which are much slower than similar ones.
Could a very large number of mapped out bad sectors be a *likely*
explanation for this?



BACKGROUND: The difference is most easily observable when I do an
online defrag of NTFS's own files (such as $MFT). The defragger
checks for and locks all data files before performing its defrag.
The speed it does this varies enormously between drives.

The difference seems to be an order of magnitude larger than the
differences which might be due to model, firmware level, type of
data, file system, etc.
From: Grant on
On Tue, 20 Oct 2009 12:11:07 +0100, jack smith <invalid(a)mail.com> wrote:

>AIUI ... a bad sector on an IDE hard drive gets labelled as unusable
>in the drive's map and a substitute sector is found. The user
>wouldn't know about it because this happens transparently.

Apart from the extra time taken to seek to the reserve track and back?
>
>If there are LOTS of bad sectors, couldn't we have a situation
>where the drive's performance is poor but running drive diagnostics
>gives no indication that anything is wrong?

SMART will show it, for example (with comments):
...
SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 067 062 006 Pre-fail Always - 94656318

Equals Hardware_ECC_Recovered --> okay

3 Spin_Up_Time 0x0003 097 097 000 Pre-fail Always - 0
4 Start_Stop_Count 0x0032 100 100 020 Old_age Always - 126
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0

No reallocated sectors

7 Seek_Error_Rate 0x000f 084 060 030 Pre-fail Always - 312238221
9 Power_On_Hours 0x0032 059 059 000 Old_age Always - 36285

Over four years spinning away

10 Spin_Retry_Count 0x0013 100 100 097 Pre-fail Always - 0
12 Power_Cycle_Count 0x0032 099 099 020 Old_age Always - 1785
194 Temperature_Celsius 0x0022 037 053 000 Old_age Always - 37

37'C now, max was 53'C -- okay, no overheating.

195 Hardware_ECC_Recovered 0x001a 067 062 000 Old_age Always - 94656318
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 199 000 Old_age Always - 2

Oops, bumped the data cable

200 Multi_Zone_Error_Rate 0x0000 100 253 000 Old_age Offline - 0
202 TA_Increase_Count 0x0032 100 253 000 Old_age Always - 0
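
That listing is from smartmontools; if you have it installed,
something like this prints it:

dd if=/dev/sda of=/dev/null bs=1M aside, the SMART dump is just:

smartctl -a /dev/sda   # identity, SMART attributes and the error log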
>
>I have some hard drives which are much slower than similar ones.
>Could a very large number of mapped out bad sectors be a *likely*
>explanation for this?

Yep, or retries on an iffy sector. The drive cannot remap an internal
sector until the OS writes the entire internal sector, which may
differ from the 'reported' sector size (the drive lies about its
physical geometry to the OS).
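
If smartctl's error log names the failing sector, you can hand the
drive that full write yourself -- a sketch, assuming 512-byte sectors,
with $LBA a hypothetical shell variable holding the logged sector
number (this destroys whatever data was in that sector):

dd if=/dev/zero of=/dev/sdX bs=512 count=1 seek=$LBA   # rewrite one sector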

>BACKGROUND: The difference is most easily observable when I do an
>online defrag of NTFS's own files (such as $MFT). The defragger
>checks for and locks all data files before performing its defrag.
>The speed it does this varies enormously between drives.

You're measuring something else: time taken to read a file depends on
fragmentation and seek distance between fragments.

Performing a sequential read of the drive surface[1] would be a better
test as you could then listen for the seeking to remapped sectors.

[1] dd if=/dev/sda bs=1M of=/dev/null # at a unix-like command prompt

If you don't run Linux or Unix, try a recent Linux live CD.
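
For a quick relative throughput check between drives, hdparm's timing
test also works (assuming hdparm is available):

hdparm -t /dev/sda   # timed sequential reads straight from the device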
>
>The difference seems to be an order of magnitude larger than the
>differences which might be due to model, firmware level, type of
>data, file system, etc.

Yes, read retries will dominate: the drive retries internally, then
the OS might also ask for more retries. HDD seek time is next if a
file is severely fragged or has many relocated sectors -- but a
modern drive with relocated sectors is on the way out and should
be replaced.

You may recover a HDD by writing zeroes to the entire drive; this
is the modern equivalent of a 'low level format'.

It's easy in Linux:

dd if=/dev/zero bs=1M of=/dev/sdX

For 'doze, use the manufacturer's bootable CD image drive fixer --
it does the same thing a bit differently. Either way the process
gives the drive's firmware a chance to remap any iffy sectors.
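
After the zero fill it's worth seeing whether anything got remapped
(smartmontools again):

smartctl -A /dev/sdX | grep -i -e realloc -e pending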

Grant.
--
http://bugsplatter.id.au
From: Rod Speed on
jack smith wrote:

> AIUI ... a bad sector on an IDE hard drive gets labelled as unusable
> in the drive's map and a substitute sector is found. The user
> wouldn't know about it because this happens transparently.

That last isn't necessarily true. If it's bad on a read, it
won't get transparently remapped until it's written to.

> If there are LOTS of bad sectors then couldn't we have a situation
> where the drive's performance is poor but there is no indication in
> running drive diagnostics that there's anything wrong?

No, essentially because remapping doesn't necessarily affect performance.

In practice the drive turns the LBA into CHS values mathematically
and the remapped sectors are just part of that maths, so that
has no effect on performance.
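
For illustration, the classical logical-geometry translation (the
real mapping on a modern zoned drive lives in firmware) is

LBA = (C * heads + H) * sectors_per_track + (S - 1)

e.g. with 16 heads and 63 sectors per track, C=2, H=3, S=4 gives
LBA = (2*16 + 3)*63 + 3 = 2208.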

And if a drive does have a large number of bads, it's dying, and
will be retrying on the not-completely-bad sectors, so that will
have a much bigger effect on performance, particularly the retries.

> I have some hard drives which are much slower than similar ones.
> Could a very large number of mapped out bad sectors be a *likely*
> explanation for this?

Very unlikely. Most likely they are just retrying on the not-completely-bad sectors.

> BACKGROUND: The difference is most easily observable when
> I do an online defrag of NTFS's own files (such as $MFT). The
> defragger checks for and locks all data files before performing its
> defrag. The speed it does this varies enormously between drives.

That can be due to other effects. Some defraggers vary very
significantly in speed just from the file details, not the physical drive.

It can also just be that what look to you like similar drives are very
different physically, particularly sectors per track and seek times.

> The difference seems to be an order of magnitude larger than the
> differences which might be due to model, firmware level, type of
> data, file system, etc.

Post the Everest SMART stats on the best and worst drives.
http://www.majorgeeks.com/download.php?det=4181
That will at least show what bad sectors the drives have.


From: Rod Speed on
Grant wrote
> jack smith <invalid(a)mail.com> wrote

>> AIUI ... a bad sector on an IDE hard drive gets labelled as unusable
>> in the drive's map and a substitute sector is found. The user
>> wouldn't know about it because this happens transparently.

> Apart from the extra time taken to seek to the reserve track and back?

There is no reserve track with modern drives.

From: Arno on
In comp.sys.ibm.pc.hardware.storage jack smith <invalid(a)mail.com> wrote:
> AIUI ... a bad sector on an IDE hard drive gets labelled as unusable
> in the drive's map and a substitute sector is found. The user
> wouldn't know about it because this happens transparently.

> If there are LOTS of bad sectors, couldn't we have a situation
> where the drive's performance is poor but running drive diagnostics
> gives no indication that anything is wrong?

You can get poor performance, but not really from the remapping
itself. You get it while the drive is trying to recover the sectors
it will eventually remap.

But you can get this diagnosed: just look at the raw value of the
SMART reallocated sector count attribute.
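
With smartmontools, for example:

smartctl -A /dev/sdX | grep Reallocated_Sector_Ct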

> I have some hard drives which are much slower than similar ones.
> Could a very large number of mapped out bad sectors be a *likely*
> explanation for this?

Not really.

> BACKGROUND: The difference is most easily observable when I do an
> online defrag of NTFS's own files (such as $MFT). The defragger
> checks for and locks all data files before performing its defrag.
> The speed it does this varies enormously between drives.

> The difference seems to be of another order of magnitude in size to
> the differences which might be due to model, firmware level, type of
> data, file system, etc.

Drives remap at most a few thousand sectors. That is not enough
for a significant performance degradation. Drives that remap that
many sectors are also typically in the process of dying.
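
Back-of-the-envelope, assuming ~10 ms for the extra seek a remapped
sector can cost: even 1000 scattered remaps add only about
1000 * 10 ms = 10 s over a full-surface read, nowhere near an
order-of-magnitude slowdown. Retries at seconds per marginal sector
can manage that easily.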

You likely have a different issue, or it may just be a natural
speed difference.

Arno