From: Christian Franke on
David Brown wrote:
>
> If I try a smartctl test ("smartctl -t short /dev/sda") on a standby
> drive, however, it fails - it will wake up the drive, but smartctl gives
> up waiting and returns an error message before the drive is ready. Once
> the drive is up to speed, the test runs fine.
>

This likely occurs because of the SCSI timeout value of 6 seconds
used by smartctl for the SAT ATA PASS-THROUGH command.

This is too short to spin up a disk. For example, a 1TB Samsung drive
(HD103UJ) spins up in ~9 seconds.

I fixed this in smartmontools r2924; the timeout is now 20 seconds. Please
try the current code from the SVN repository:
http://sourceforge.net/apps/trac/smartmontools/wiki/Download
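
If building from SVN is not an option, one possible workaround is to wake
the drive with an ordinary read before asking for the self-test. The
following is only a rough sketch under several assumptions: the function
names are made up for illustration, the 30 second limit is an arbitrary
guess, and it assumes a plain block read is allowed enough time by the
kernel for the drive to spin up. It relies on "hdparm -C" output in the
form shown in the transcript in this thread, and must run as root.

```python
import subprocess
import time


def drive_state(device, run=subprocess.run):
    """Return the state printed by `hdparm -C`, e.g. 'standby'."""
    out = run(["hdparm", "-C", device],
              capture_output=True, text=True, check=True).stdout
    for line in out.splitlines():
        if "drive state is:" in line:
            return line.split(":", 1)[1].strip()
    return "unknown"


def wake_and_wait(device, timeout_s=30.0, poll_s=1.0,
                  run=subprocess.run, sleep=time.sleep):
    """Force a spin-up with a small uncached read, then poll the state.

    timeout_s is an assumed upper bound, not a value from smartmontools.
    """
    # iflag=direct bypasses the page cache so the read really hits the disk.
    run(["dd", f"if={device}", "of=/dev/null",
         "bs=4096", "count=1", "iflag=direct"],
        capture_output=True, check=True)
    waited = 0.0
    while waited <= timeout_s:
        if drive_state(device, run).startswith("active"):
            return True
        sleep(poll_s)
        waited += poll_s
    return False


def start_short_test(device, run=subprocess.run, sleep=time.sleep):
    """Start the short self-test only once the drive reports active."""
    if not wake_and_wait(device, run=run, sleep=sleep):
        raise TimeoutError(f"{device} did not leave standby in time")
    return run(["smartctl", "-t", "short", device],
               capture_output=True, text=True)
```

Calling start_short_test("/dev/sda") would then replace the bare
"smartctl -t short /dev/sda". The run and sleep parameters exist only so
the logic can be exercised without touching a real disk.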

Thanks for the problem report.

Christian
From: Arno on
David Brown <david(a)westcontrol.removethisbit.com> wrote:
[...]
> Here's a transcript:


> host:~# hdparm -y /dev/sda

> /dev/sda:
> issuing standby command
> host:~# hdparm -C /dev/sda

> /dev/sda:
> drive state is: standby


> host:~# smartctl -t short /dev/sda
> smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8
> Bruce Allen
> Home page is http://smartmontools.sourceforge.net/

> === START OF OFFLINE IMMEDIATE AND SELF-TEST SECTION ===
> Sending command: "Execute SMART Short self-test routine immediately in
> off-line mode".
> Command "Execute SMART Short self-test routine immediately in off-line
> mode" failed


> host:~# smartctl -c /dev/sda
> smartctl version 5.38 [x86_64-unknown-linux-gnu] Copyright (C) 2002-8
> Bruce Allen
> Home page is http://smartmontools.sourceforge.net/

> === START OF READ SMART DATA SECTION ===
> General SMART Values:
> Offline data collection status: (0x00) Offline data collection activity
> was never started.
> Auto Offline Data Collection:
> Disabled.
> Self-test execution status: ( 41) The self-test routine was
> interrupted
> by the host with a hard or soft
> reset.




> syslog gives the following:

> Sep 26 20:50:23 host kernel: [ 9865.466082] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
> Sep 26 20:50:23 host kernel: [ 9865.466110] ata1.00: cmd b0/d4:00:01:4f:c2/00:00:00:00:00/00 tag 0
> Sep 26 20:50:23 host kernel: [ 9865.466111] res 40/00:00:01:4f:c2/00:00:00:00:00/00 Emask 0x4 (timeout)

OK, so it is the kernel. Going by Christian's post, it seems the command
is handed to the kernel with a timeout parameter. I would deduce
that a normal access either has a larger timeout value
or waits longer when the drive is not active.
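
For what it is worth, the "cmd" line in the syslog excerpt appears to be
exactly the self-test request that smartctl sent. A small sketch for
decoding it follows, assuming the kernel prints the taskfile in the order
command/feature:nsect:lbal:lbam:lbah and that any mail-wrapping of the
line has been joined first. The function name and the lookup tables are
illustrative and cover only a few SMART values.

```python
import re

# A few SMART feature codes and EXECUTE OFF-LINE IMMEDIATE subcommands.
# Deliberately incomplete; see the ATA command set for the full lists.
SMART_FEATURES = {
    0xD0: "READ DATA",
    0xD4: "EXECUTE OFF-LINE IMMEDIATE",
    0xDA: "RETURN STATUS",
}
OFFLINE_SUBCOMMANDS = {
    0x01: "short self-test (off-line mode)",
    0x02: "extended self-test (off-line mode)",
    0x7F: "abort self-test",
}

CMD_RE = re.compile(
    r"cmd ([0-9a-f]{2})/([0-9a-f]{2}):([0-9a-f]{2}):"
    r"([0-9a-f]{2}):([0-9a-f]{2}):([0-9a-f]{2})/"
)


def decode_cmd(line):
    """Decode the first six taskfile bytes of a libata 'cmd' log line."""
    m = CMD_RE.search(line)
    if m is None:
        return None
    command, feature, nsect, lbal, lbam, lbah = (
        int(x, 16) for x in m.groups()
    )
    # SMART commands use opcode 0xB0 with the 0x4F/0xC2 signature
    # in LBA mid and LBA high.
    is_smart = command == 0xB0 and lbam == 0x4F and lbah == 0xC2
    return {
        "command": command,
        "is_smart": is_smart,
        "feature": SMART_FEATURES.get(feature, hex(feature)),
        "subcommand": OFFLINE_SUBCOMMANDS.get(lbal, hex(lbal)),
    }
```

Fed the line "ata1.00: cmd b0/d4:00:01:4f:c2/00:00:00:00:00/00 tag 0",
this reports a SMART command with feature EXECUTE OFF-LINE IMMEDIATE and
the short self-test subcommand, which matches what smartctl said it was
sending before the kernel logged the timeout.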

Anyway, Christian seems to have fixed this.

Arno
From: David Brown on
Christian Franke wrote:
> David Brown wrote:
>>
>> If I try a smartctl test ("smartctl -t short /dev/sda") on a standby
>> drive, however, it fails - it will wake up the drive, but smartctl
>> gives up waiting and returns an error message before the drive is
>> ready. Once the drive is up to speed, the test runs fine.
>>
>
> This likely occurs because of the SCSI timeout value of 6 seconds
> used by smartctl for the SAT ATA PASS-THROUGH command.
>
> This is too short to spin up a disk. For example, a 1TB Samsung drive
> (HD103UJ) spins up in ~9 seconds.
>

This sounds like a very likely explanation.

> I fixed this in smartmontools r2924; the timeout is now 20 seconds. Please
> try the current code from the SVN repository:
> http://sourceforge.net/apps/trac/smartmontools/wiki/Download
>
> Thanks for the problem report.
>
> Christian

Many thanks! I'll give this a try this evening, if I get the chance.

mvh.,

David