From: yuko on
Hi everybody

I know sdparm can set RTL, the "recovery time limit" (something like WD's
"TLER"). Has anybody ever set it on a Hitachi SATA drive? (I'd have to do
this on a first-edition 7K1000, not a 7K1000.B.)

http://www.hitachigst.com/tech/techlib.nsf/techdocs/CF6C5D47F5BCE65B862574C1007E985D/$file/DS_CS_7K1000.B_Spec_rev3.0.pdf
(this is for 7K1000.B, I couldn't find it for 7K1000)
Paragraph 9.21.3.2 says that the recovery time limit is in 100 ms units,
while sdparm seems to use that exact same command but wants milliseconds.
Should I enter it as milliseconds?

Also, that same paragraph and paragraph 7.3 say the minimum time limit is
6.5 seconds, but I don't understand whether that applies only while the
drive is spinning up (see para 7.3) or is a global limit.

Another question: do you know the relationship between RTL and RRC (read
retry count)? Is it the lower of the two that takes precedence? With an
RTL of 6.5 seconds I'd expect the drive to make thousands of read-retry
attempts!?

Another question: I have a 3ware controller that does not allow me to
use sdparm. I could use a USB-to-SATA converter... do you think it
would work? Does it depend on the controller integrated in the USB
converter? I tried on an old 3.5" Samsung 500GB of mine and it didn't
work: sdparm wasn't able to set the RTL or any other value that came to
mind. Was that the Samsung not supporting the Read-Write Error
Recovery options, or the USB controller not supporting them?

Another question: in
sdparm --enumerate --page=rw
options are like this:
RRC [0x03:7:8 ] Read retry count
RTL [0x0a:7:16] Recovery time limit (ms)
what does the stuff between the [ ] mean? What are the values 0x03 7 8
0x0a 7 16 ?

Sorry if these seem like stupid "just try, don't ask" questions, but I'd
have to do this on the disks of a server full of important data...

Thank you for any knowledge
From: Franc Zabkar on
On Thu, 03 Dec 2009 23:52:15 +0100, yuko <yuko(a)nowhere.org> put finger
to keyboard and composed:

>I know sdparm can set RTL, the "recovery time limit" (something like WD's
>"TLER"). Has anybody ever set it on a Hitachi SATA drive? (I'd have to do
>this on a first-edition 7K1000, not a 7K1000.B.)
>
>http://www.hitachigst.com/tech/techlib.nsf/techdocs/CF6C5D47F5BCE65B862574C1007E985D/$file/DS_CS_7K1000.B_Spec_rev3.0.pdf
>(this is for 7K1000.B, I couldn't find it for 7K1000)
>Paragraph 9.21.3.2 says that the recovery time limit is in 100 ms units,
>while sdparm seems to use that exact same command but wants milliseconds.
>Should I enter it as milliseconds?
>
>Also, that same paragraph and paragraph 7.3 say the minimum time limit is
>6.5 seconds, but I don't understand whether that applies only while the
>drive is spinning up (see para 7.3) or is a global limit.
>
>Another question: do you know the relationship between RTL and RRC (read
>retry count)? Is it the lower of the two that takes precedence? With an
>RTL of 6.5 seconds I'd expect the drive to make thousands of read-retry
>attempts!?

AIUI, Western Digital's TLER, Samsung's CCTL, and Seagate's ERC are
defined in the following document:

Working Draft AT Attachment 8 - ATA/ATAPI Command Set (ATA8-ACS):
http://www.t13.org/Documents/UploadedDocuments/docs2008/D1699r6a-ATA8-ACS.pdf

See "SCT Error Recovery Control" in section 8.3.4.

To confirm whether ERC is supported by the drive, see section 4.1.

Bit 3 (Error Recovery Control, AC3, supported) and bit 0 (SCT Feature
Set supported) of word 206 (SCT Command set support) in the Identify
Device data block should be set to 1.

The "Recovery Time Limit" is set in word 3 of the SCT Error Recovery
Control command:

"If the Function Code is 0001h then this field contains the recovery
time limit in 100 ms units (e.g., a value of 1 = 100 ms, 2 = 200 ms).
The tolerance is vendor specific."
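
To make the units concrete, here is a rough Python sketch that converts
seconds to 100 ms units and places the result in word 3 of an SCT key
sector. The action code (0003h) and the selection codes (1 = read timer,
2 = write timer) are my reading of the ATA8-ACS draft, so check them
against the spec; the function names are made up for this post, and
nothing here sends anything to a drive.

```python
import struct

ACTION_ERROR_RECOVERY_CONTROL = 0x0003  # assumed from the ATA8-ACS draft
FUNCTION_SET_NEW_VALUE = 0x0001         # Function Code 0001h, as quoted above


def rtl_units(seconds: float) -> int:
    """Convert seconds to the 100 ms units used by SCT ERC (1 = 100 ms)."""
    units = round(seconds * 10)
    if not 0 <= units <= 0xFFFF:
        raise ValueError("recovery time limit must fit in one 16-bit word")
    return units


def build_erc_key_sector(seconds: float, selection: int = 1) -> bytes:
    """Build a 512-byte key sector with the time limit in word 3.

    selection: 1 = read timer, 2 = write timer (assumed from the draft).
    """
    words = (ACTION_ERROR_RECOVERY_CONTROL, FUNCTION_SET_NEW_VALUE,
             selection, rtl_units(seconds))
    # Four little-endian 16-bit words, zero-padded to a full sector.
    return struct.pack("<4H", *words).ljust(512, b"\x00")
```

So a 7 second limit is the value 70 here, whereas a field that sdparm
labels "(ms)" would take 7000 for the same time.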

>Another question: I have a 3ware controller that does not allow me to
>use sdparm. I could use a USB-to-SATA converter... do you think it
>would work? Does it depend on the controller integrated in the USB
>converter? I tried on an old 3.5" Samsung 500GB of mine and it didn't
>work: sdparm wasn't able to set the RTL or any other value that came to
>mind. Was that the Samsung not supporting the Read-Write Error
>Recovery options, or the USB controller not supporting them?

The ERC commands can be "tunnelled" inside SMART READ LOG and SMART
WRITE LOG commands by treating them as data. A drive that supports ERC
will recognise the ERC commands within the data and process them as
such. Some USB-SATA bridge chips do not support SMART commands, so
presumably they will not be able to tunnel ERC data either.

However, the ATA standard provides an alternative method, by using
READ LOG (DMA) EXT and WRITE LOG (DMA) EXT commands. See page 301 of
the abovementioned document.

>Another question: in
>sdparm --enumerate --page=rw
>options are like this:
> RRC [0x03:7:8 ] Read retry count
> RTL [0x0a:7:16] Recovery time limit (ms)
>what does the stuff between the [ ] mean? What are the values 0x03 7 8
>0x0a 7 16 ?

sdparm(8) - Linux man page:
http://linux.die.net/man/8/sdparm

The command ...

sdparm --enumerate --page=rw

... "lists contents of read write error recovery mode page".

This document has some information on page 4:
http://www.t10.org/ftp/t10/document.05/05-374r2.pdf

Table 3 indicates that the READ RETRY COUNT occupies byte #0x03 of the
Read-Write Error Recovery mode page.

The RECOVERY TIME LIMIT is at bytes 0x0A and 0x0B.

Here is the corresponding documentation on the sdparm man page:

"When known parameters (fields) of a mode page are listed, each line
starts with an acronym (indented a few spaces). This will match (or be
an acronym for) the description for that field found in the (draft)
standards. Next are three numbers, separated by colons, surrounded by
brackets. These are the start byte (in hex, prefixed by "0x") of the
beginning of the field within the mode page; the starting bit (0
through 7 inclusive) and then the number of bits. The descriptive name
of the parameter (field) is then given. If appropriate the descriptive
name includes units (e.g. "(ms)" means the units are milliseconds).
Adding the '-ll' option will list information about possible field
values for selected mode page parameters."
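
As an illustration of that notation, the Python sketch below parses a
"[start byte:start bit:number of bits]" triple and pulls the field out
of a mode page buffer. It assumes bit 7 is the most significant bit of a
byte and that multi-byte fields are big-endian, which is how I read the
man page text; the function names and the sample page contents are
invented for the example.

```python
import re


def parse_field_spec(spec: str) -> tuple[int, int, int]:
    """Parse e.g. '[0x0a:7:16]' into (start_byte, start_bit, num_bits)."""
    m = re.fullmatch(r"\[\s*(0x[0-9a-fA-F]+):(\d+):(\d+)\s*\]", spec.strip())
    if m is None:
        raise ValueError(f"unrecognised field spec: {spec!r}")
    return int(m.group(1), 16), int(m.group(2)), int(m.group(3))


def get_field(page: bytes, start_byte: int, start_bit: int, num_bits: int) -> int:
    """Extract a field, reading bits from most to least significant."""
    # Offset of the field's first bit, counted from bit 7 of page[0].
    first = start_byte * 8 + (7 - start_bit)
    value = 0
    for i in range(first, first + num_bits):
        bit = (page[i // 8] >> (7 - i % 8)) & 1
        value = (value << 1) | bit
    return value


# Made-up 12-byte Read-Write Error Recovery page: RRC = 5, RTL = 7000.
page = bytearray(12)
page[0x03] = 5
page[0x0A:0x0C] = (7000).to_bytes(2, "big")
rrc = get_field(page, *parse_field_spec("[0x03:7:8 ]"))
rtl = get_field(page, *parse_field_spec("[0x0a:7:16]"))
```

With that sample page, RRC [0x03:7:8] is the whole of byte 0x03, and
RTL [0x0a:7:16] is bytes 0x0A and 0x0B taken as one 16-bit number.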

BTW, HDAT2 now has support for ERC:
http://www.hdat2.com/

- Franc Zabkar
--
Please remove one 'i' from my address when replying by email.
From: Franc Zabkar on
On Fri, 11 Dec 2009 13:37:47 +1100, Franc Zabkar
<fzabkar(a)iinternode.on.net> put finger to keyboard and composed:

>AIUI, Western Digital's TLER, Samsung's CCTL, and Seagate's ERC are
>defined in the following document:
>
>Working Draft AT Attachment 8 - ATA/ATAPI Command Set (ATA8-ACS):
>http://www.t13.org/Documents/UploadedDocuments/docs2008/D1699r6a-ATA8-ACS.pdf
>
>See "SCT Error Recovery Control" in section 8.3.4.

Section 5.3 of the following document also describes the standard.

Information Technology - SMART Command Transport (SCT):
http://www.t13.org/Documents/UploadedDocuments/docs2005/DT1701r5-SCT.pdf

The following Wikipedia article also has some information.

Time-Limited Error Recovery:
http://en.wikipedia.org/wiki/TLER

- Franc Zabkar