From: Stephan Diestelhorst on
On Wednesday 28 July 2010, 23:50:09 Rafael J. Wysocki wrote:
> On Saturday, July 10, 2010, Tejun Heo wrote:
> > On 07/10/2010 08:50 AM, Stephan Diestelhorst wrote:
> > >> I have a box where this problem is kind of reproducible, but it happens _very_
> > >> rarely. Also I can't reproduce it on demand running suspend-resume in a tight
> > >> loop. Are you able to reproduce it more regularly?
> > >
> > > For me it is much more reproducible. If I run multiple direct writing
> > > dd-s to the disk in question I trigger it rather reliably (~75% or
> > > higher). See the attached script from an earlier email.
> > > Maybe that helps trigger your case more reliably, too?
> >
> That didn't help, but the appended patch fixes the problem for me.

<snip>

Sorry for taking ages. Vacation and catching up after it are to blame,
as is me forgetting to build a proper initrd...

Thanks for the patch! It certainly changes behaviour, however, in a
very strange way for me. With your patch my machine does not suspend
to RAM anymore (a simple echo mem > /sys/power/state blocks), and
nothing happens in dmesg if there is a lot of write I/O while
suspending. (A number of parallel dd's with oflag=direct)
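
The scenario above can be sketched roughly as follows. This is only an illustration, not the script attached to the earlier email: the variable names (TARGET_DIR, WRITERS, COUNT_MB, DRY_RUN) and the file names are hypothetical, and it assumes the standard sysfs interface /sys/power/state. By default it only prints the commands; set DRY_RUN=0 and run as root to actually execute them.

```shell
#!/bin/sh
# Sketch: start several direct-I/O writers, then request suspend-to-RAM.
# All variable names below are assumptions made for this example.
TARGET_DIR="${TARGET_DIR:-/tmp}"   # directory on the disk under test
WRITERS="${WRITERS:-4}"            # number of parallel dd processes
COUNT_MB="${COUNT_MB:-1024}"       # MiB written by each dd
DRY_RUN="${DRY_RUN:-1}"            # 1 = print commands only

start_writers() {
    i=1
    while [ "$i" -le "$WRITERS" ]; do
        if [ "$DRY_RUN" = "1" ]; then
            echo "dd if=/dev/zero of=$TARGET_DIR/ddtest.$i bs=1M count=$COUNT_MB oflag=direct"
        else
            # oflag=direct bypasses the page cache, so the writes go
            # straight to the block device queue.
            dd if=/dev/zero of="$TARGET_DIR/ddtest.$i" bs=1M \
               count="$COUNT_MB" oflag=direct &
        fi
        i=$((i + 1))
    done
}

request_suspend() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "echo mem > /sys/power/state"
    else
        # Requires root; blocks until the system has suspended and resumed.
        echo mem > /sys/power/state
    fi
}

start_writers
# Give the writers a moment to build up I/O before asking for suspend.
[ "$DRY_RUN" = "1" ] || sleep 2
request_suspend
```

With DRY_RUN=0 the dd processes keep running in the background while the suspend request is issued, which is the situation described above.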

If I stop the I/O, the system eventually goes into suspend to RAM.
However, that takes a while, after the I/O has stopped, and also
from "Preparing system for suspend" log entry until it is actually
done.

Is this intentional? Let me know how I can debug this further!
Ideally I'd like to be able to suspend the machine under I/O load,
too. (E.g. during a compile job.)

Can you reproduce this at your end, too?

Many thanks,
Stephan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rafael J. Wysocki on
On Monday, August 02, 2010, Stephan Diestelhorst wrote:
> On Wednesday 28 July 2010, 23:50:09 Rafael J. Wysocki wrote:
> > On Saturday, July 10, 2010, Tejun Heo wrote:
> > > On 07/10/2010 08:50 AM, Stephan Diestelhorst wrote:
> > > >> I have a box where this problem is kind of reproducible, but it happens _very_
> > > >> rarely. Also I can't reproduce it on demand running suspend-resume in a tight
> > > >> loop. Are you able to reproduce it more regularly?
> > > >
> > > > For me it is much more reproducible. If I run multiple direct writing
> > > > dd-s to the disk in question I trigger it rather reliably (~75% or
> > > > higher). See the attached script from an earlier email.
> > > > Maybe that helps trigger your case more reliably, too?
> > >
> > That didn't help, but the appended patch fixes the problem for me.
>
> <snip>
>
> Sorry for taking ages. Vacation and catching up after it are to blame,
> as is me forgetting to build a proper initrd...
>
> Thanks for the patch! It certainly changes behaviour, however, in a
> very strange way for me. With your patch my machine does not suspend
> to RAM anymore (a simple echo mem > /sys/power/state blocks), and
> nothing happens in dmesg if there is a lot of write I/O while
> suspending. (A number of parallel dd's with oflag=direct)
>
> If I stop the I/O, the system eventually goes into suspend to RAM.
> However, that takes a while, after the I/O has stopped, and also
> from "Preparing system for suspend" log entry until it is actually
> done.
>
> Is this intentional?

It surely isn't.

> Let me know how I can debug this further!
> Ideally I'd like to be able to suspend the machine under I/O load,
> too. (E.g. during a compile job.)
>
> Can you reproduce this at your end, too?

Well, I didn't try suspending with a number of parallel dd's with oflag=direct
in the background, but otherwise I'm not reproducing the issue with
the patch applied.

Rafael
From: Stephan Diestelhorst on
On Monday 02 August 2010, 23:38:05 Rafael J. Wysocki wrote:
> On Monday, August 02, 2010, Stephan Diestelhorst wrote:
> > On Wednesday 28 July 2010, 23:50:09 Rafael J. Wysocki wrote:
> > > On Saturday, July 10, 2010, Tejun Heo wrote:
> > > > On 07/10/2010 08:50 AM, Stephan Diestelhorst wrote:
> > > > >> I have a box where this problem is kind of reproducible, but it happens _very_
> > > > >> rarely. Also I can't reproduce it on demand running suspend-resume in a tight
> > > > >> loop. Are you able to reproduce it more regularly?
> > > > >
> > > > > For me it is much more reproducible. If I run multiple direct writing
> > > > > dd-s to the disk in question I trigger it rather reliably (~75% or
> > > > > higher). See the attached script from an earlier email.
> > > > > Maybe that helps trigger your case more reliably, too?
> > > >
> > > That didn't help, but the appended patch fixes the problem for me.
> >
> > <snip>
> >
> > Sorry for taking ages. Vacation and catching up after it are to blame,
> > as is me forgetting to build a proper initrd...
> >
> > Thanks for the patch! It certainly changes behaviour, however, in a
> > very strange way for me. With your patch my machine does not suspend
> > to RAM anymore (a simple echo mem > /sys/power/state blocks), and
> > nothing happens in dmesg if there is a lot of write I/O while
> > suspending. (A number of parallel dd's with oflag=direct)
> >
> > If I stop the I/O, the system eventually goes into suspend to RAM.
> > However, that takes a while, after the I/O has stopped, and also
> > from "Preparing system for suspend" log entry until it is actually
> > done.
> >
> > Is this intentional?
>
> It surely isn't.
>
> > Let me know how I can debug this further!
> > Ideally I'd like to be able to suspend the machine under I/O load,
> > too. (E.g. during a compile job.)
> >
> > Can you reproduce this at your end, too?
>
> Well, I didn't try suspending with a number of parallel dd's with oflag=direct
> in the background, but otherwise I'm not reproducing the issue with
> the patch applied.

Mhmhm, I have tried to reproduce my issue again, and also added some
dev_printk's around your code to understand where the delay is
happening.

However, I have not been able to reproduce the issue (with and without
the debug output) anymore, and I am happy to report that for now your
patch helps.

I'd like to keep this under observation for a little while longer,
though.

Many thanks,
Stephan

--
Stephan Diestelhorst, AMD Operating System Research Center
stephan.diestelhorst(a)amd.com, Tel. +49 (0)351 448 356 719

Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo, Andrew Bowd
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632

From: Rafael J. Wysocki on
On Tuesday, August 03, 2010, Stephan Diestelhorst wrote:
> On Monday 02 August 2010, 23:38:05 Rafael J. Wysocki wrote:
> > On Monday, August 02, 2010, Stephan Diestelhorst wrote:
> > > On Wednesday 28 July 2010, 23:50:09 Rafael J. Wysocki wrote:
> > > > On Saturday, July 10, 2010, Tejun Heo wrote:
> > > > > On 07/10/2010 08:50 AM, Stephan Diestelhorst wrote:
> > > > > >> I have a box where this problem is kind of reproducible, but it happens _very_
> > > > > >> rarely. Also I can't reproduce it on demand running suspend-resume in a tight
> > > > > >> loop. Are you able to reproduce it more regularly?
> > > > > >
> > > > > > For me it is much more reproducible. If I run multiple direct writing
> > > > > > dd-s to the disk in question I trigger it rather reliably (~75% or
> > > > > > higher). See the attached script from an earlier email.
> > > > > > Maybe that helps trigger your case more reliably, too?
> > > > >
> > > > That didn't help, but the appended patch fixes the problem for me.
> > >
> > > <snip>
> > >
> > > Sorry for taking ages. Vacation and catching up after it are to blame,
> > > as is me forgetting to build a proper initrd...
> > >
> > > Thanks for the patch! It certainly changes behaviour, however, in a
> > > very strange way for me. With your patch my machine does not suspend
> > > to RAM anymore (a simple echo mem > /sys/power/state blocks), and
> > > nothing happens in dmesg if there is a lot of write I/O while
> > > suspending. (A number of parallel dd's with oflag=direct)
> > >
> > > If I stop the I/O, the system eventually goes into suspend to RAM.
> > > However, that takes a while, after the I/O has stopped, and also
> > > from "Preparing system for suspend" log entry until it is actually
> > > done.
> > >
> > > Is this intentional?
> >
> > It surely isn't.
> >
> > > Let me know how I can debug this further!
> > > Ideally I'd like to be able to suspend the machine under I/O load,
> > > too. (E.g. during a compile job.)
> > >
> > > Can you reproduce this at your end, too?
> >
> > Well, I didn't try suspending with a number of parallel dd's with oflag=direct
> > in the background, but otherwise I'm not reproducing the issue with
> > the patch applied.
>
> Mhmhm, I have tried to reproduce my issue again, and also added some
> dev_printk's around your code to understand where the delay is
> happening.
>
> However, I have not been able to reproduce the issue (with and without
> the debug output) anymore, and I am happy to report that for now your
> patch helps.

Good.

What you might be seeing is that the patch changes the timing of suspend in
general. Since suspend is done asynchronously by default, that change might
trigger an independent bug that is sensitive to timing.

> I'd like to keep this under observation for a little while longer, though.

You can try to remove the noise produced by asynchronous suspend from the
picture by doing "echo 0 > /sys/power/pm_async" (just once after bootup).
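
A minimal sketch of that step as a small helper, in case it is easier to drop into a test script. The function name and the PM_ASYNC variable are hypothetical and only serve this example; the helper checks that the file is writable (root, and a kernel that provides pm_async) before writing to it.

```shell
#!/bin/sh
# Sketch: switch off asynchronous suspend/resume for the current boot.
# PM_ASYNC and disable_async_suspend are names assumed for this example.
PM_ASYNC="${PM_ASYNC:-/sys/power/pm_async}"

disable_async_suspend() {
    if [ ! -w "$PM_ASYNC" ]; then
        echo "cannot write $PM_ASYNC (not root, or no pm_async support?)" >&2
        return 1
    fi
    # 0 = suspend/resume devices synchronously, 1 = asynchronously.
    echo 0 > "$PM_ASYNC"
    echo "pm_async is now: $(cat "$PM_ASYNC")"
}

# Usage (as root, once after bootup):
#   disable_async_suspend
```

The setting does not persist across reboots, so it has to be repeated after each boot while the issue is under observation.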

Thanks,
Rafael