From: Mark Knecht on
On Wed, Jul 7, 2010 at 9:19 AM, Tejun Heo <tj(a)kernel.org> wrote:
> On 07/07/2010 06:15 PM, Mark Knecht wrote:
>> Certainly. Is there a way to reverse the previous patch?
>>
>> c2stable linux # patch -p1 --dry-run <~mark/Downloads/resume-dbg-1.patch
>> patching file drivers/ata/libata-core.c
>> Hunk #1 succeeded at 3798 (offset 86 lines).
>> Hunk #2 succeeded at 3833 with fuzz 2 (offset 94 lines).
>> Hunk #3 FAILED at 6109.
>> 1 out of 3 hunks FAILED -- saving rejects to file drivers/ata/libata-core.c.rej
>> c2stable linux #
>>
>> I assume this is failing because your patch is over the plain kernel,
>> not the one I've patched?
>
> $ patch -R -p1 < ~mark/Downloads/resume-dbg.patch
> $ patch -p1 < ~mark/Downloads/resume-dbg-1.patch
>
> --
> tejun
>
Thanks. Building the new kernel now. I'll start trying to save the
data you're looking for.

Cheers,
Mark
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
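
The reverse-then-apply sequence above can be tried safely on a scratch tree first. The following is a minimal sketch, not the actual kernel workflow: the directory layout, `demo.c`, and the patch names `dbg.patch` / `dbg-1.patch` are made-up placeholders, and it assumes POSIX `sh` with `diff` and GNU `patch` installed. It builds two alternative patches against the same base, applies the first, then reverses it (with a dry run beforehand) so the second one applies to a clean tree.

```shell
#!/bin/sh
# Scratch demo: every path and file name here is a made-up placeholder.
set -eu

work=$(mktemp -d)
cd "$work"

# A pristine base plus two alternative modified versions of it.
mkdir -p a/drivers b1/drivers b2/drivers
printf 'line one\nline two\nline three\n' > a/drivers/demo.c
printf 'line one\ndebug v1\nline two\nline three\n' > b1/drivers/demo.c
printf 'line one\ndebug v2\nline two\nline three\n' > b2/drivers/demo.c

# diff exits 1 when the files differ, so do not let "set -e" abort here.
diff -u a/drivers/demo.c b1/drivers/demo.c > dbg.patch || true
diff -u a/drivers/demo.c b2/drivers/demo.c > dbg-1.patch || true

# Working tree that already carries the first patch.
cp -R a tree
cd tree
patch -p1 < ../dbg.patch

# Check that the first patch reverses cleanly, then reverse it for real.
patch -R -p1 --dry-run < ../dbg.patch
patch -R -p1 < ../dbg.patch

# The tree is pristine again, so the second patch applies to the base it was made against.
patch -p1 < ../dbg-1.patch
```

Running the `--dry-run` step before the real reversal mirrors the check Mark did above: it reports failed hunks without touching any files or leaving `.rej` files behind.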
From: Tejun Heo on
Hello,

On 07/07/2010 07:06 PM, Mark Knecht wrote:
> 4 warm reboots. All 4 said 300 300. However, only the 4th one showed an
> extra attempt at running the patch code, and it also showed Tries =
> 2. I'm attaching boot #1 and boot #4 for now. I've saved them all if
> you need or just want them.
>
> Please note that in all 4 cases all drives were found. Nothing is
> missing in any test yet after adding either of these patches.
>
> I've not tried cold boots yet. That's next.

Hmm... just in case you're being lucky, please keep an eye on it over
several days and report the result. I think all that's necessary is
a slight modification to the resume logic but let's watch a bit first.

Thanks.

--
tejun
From: Mark Knecht on
On Wed, Jul 7, 2010 at 10:26 AM, Tejun Heo <tj(a)kernel.org> wrote:
> Hello,
>
> On 07/07/2010 07:06 PM, Mark Knecht wrote:
>> 4 warm reboots. All 4 said 300 300. However, only the 4th one showed an
>> extra attempt at running the patch code, and it also showed Tries =
>> 2. I'm attaching boot #1 and boot #4 for now. I've saved them all if
>> you need or just want them.
>>
>> Please note that in all 4 cases all drives were found. Nothing is
>> missing in any test yet after adding either of these patches.
>>
>> I've not tried cold boots yet. That's next.
>
> Hmm... just in case you're being lucky, please keep an eye on it over
> several days and report the result.  I think all that's necessary is
> a slight modification to the resume logic but let's watch a bit first.
>
> Thanks.
>
> --
> tejun
>
I've tried two cold boots so far. One of them had that same extra
Tries = 2 at the same place. I'll do a couple more just to see what
happens.

I'm happy to watch it as long as it takes and will certainly save any
results if and when a drive isn't found. This problem has run hot and
cold for a few months. Sometimes I go a week where every boot is good.
These last two weeks were finally bad enough to get me to report it.

I also want to investigate actually getting the BIOS AHCI setting
working, but I think I'll leave that alone for a few days so as to not
upset this experiment. Performance on the machine isn't critical right
now, assuming AHCI helps.

I'll get back to you when I've got something new to report.

Cheers,
Mark
From: Mark Knecht on
On Wed, Jul 7, 2010 at 10:26 AM, Tejun Heo <tj(a)kernel.org> wrote:
> Hello,
>
> On 07/07/2010 07:06 PM, Mark Knecht wrote:
>> 4 warm reboots. All 4 said 300 300. However, only the 4th one showed an
>> extra attempt at running the patch code, and it also showed Tries =
>> 2. I'm attaching boot #1 and boot #4 for now. I've saved them all if
>> you need or just want them.
>>
>> Please note that in all 4 cases all drives were found. Nothing is
>> missing in any test yet after adding either of these patches.
>>
>> I've not tried cold boots yet. That's next.
>
> Hmm... just in case you're being lucky, please keep an eye on it over
> several days and report the result.  I think all that's necessary is
> a slight modification to the resume logic but let's watch a bit first.
>
> Thanks.
>
> --
> tejun
>

Tejun,
With about 10-12 days of testing, 1-2 boots/day, I've not had a
single boot failure since adding the patch. Only twice has it said
tries=2. Every other time it's tries=1. The machine seems to work fine
either way.

Thanks,
Mark
From: Tejun Heo on
Hello,

On 07/19/2010 09:31 PM, Mark Knecht wrote:
> With about 10-12 days of testing, 1-2 boots/day, I've not had a
> single boot failure since adding the patch. Only twice has it said
> tries=2. Every other time it's tries=1. The machine seems to work fine
> either way.

Hmmm... can you please test the attached patch instead? It seems
likely that the root cause is not flakiness of SIDPR but incorrect
locking in libata EH code.

Thanks.

--
tejun