[PATCH 0/8] Suspend block api (version 8) [Kernel]

From: Alan Cox on 27 May 2010 15:00

> No, it's not. Forced suspend may be in response to hitting a key, but it

You are the only person here talking about 'forced' suspends. The rest of
us are talking about idling down and ensuring we are always in a state we
un-idle correctly.

> may also be in response to a 30 minute timeout expiring. If I get a WoL
> packet in the 0.5 of a second between userspace deciding to suspend and
> actually doing so, the system shouldn't suspend.

I don't think that argument holds water in the form you have it

What about 5 nanoseconds before you suspend. Now you can't do that (laws
of physics and stuff).

So your position would seem to be "we have a race but can debate how big
is permissible"

The usual model is

"At no point should we be in a position when entering a suspend style
deep sleep where we neither abort the suspend, nor commit to a
suspend/resume sequence if the events we care about occur"

and that is why the hardware model is

    Set wake flags
    Check if idle
    If idle
        Suspend
    else
        Clear wake flags
        Unwind

and the wake flags guarantee that an event at any point after the wake
flags are set until they are cleared will cause a suspend to be resumed,
possibly immediately after the suspend.
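
The sequence above can be sketched as a small simulation. This is a toy model under stated assumptions, not kernel code: the Platform class, try_suspend() and the event_in_window parameter are hypothetical names invented for illustration. It shows that once the wake flags are armed, an event arriving between the idle check and the suspend is latched and causes an immediate resume rather than being lost.

```python
from dataclasses import dataclass


@dataclass
class Platform:
    """Toy hardware with a latched wake flag (hypothetical, not a kernel API)."""
    wake_armed: bool = False
    wake_pending: bool = False
    suspended: bool = False

    def event(self):
        """A wakeup event arrives; it is latched only while wake flags are armed."""
        if self.wake_armed:
            self.wake_pending = True
            self.suspended = False  # a latched event resumes a suspended system


def try_suspend(hw, is_idle, event_in_window=False):
    hw.wake_armed = True            # Set wake flags
    idle = is_idle()                # Check if idle
    if event_in_window:             # event lands after the check, before suspend
        hw.event()
    if not idle:
        hw.wake_armed = False       # Clear wake flags
        hw.wake_pending = False
        return "unwound"            # Unwind
    hw.suspended = True             # Suspend
    if hw.wake_pending:             # latched event: resume right after suspending
        hw.suspended = False
        hw.wake_pending = False
        return "resumed"
    return "suspended"
```

Calling try_suspend(Platform(), lambda: True, event_in_window=True) returns "resumed" in this model: the suspend was committed to and then undone by the latched event, which is the "possibly immediately after the suspend" case.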

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Thomas Gleixner on 27 May 2010 15:00

On Thu, 27 May 2010, Matthew Garrett wrote:

> On Thu, May 27, 2010 at 07:59:02PM +0200, Thomas Gleixner wrote:
> > On Thu, 27 May 2010, Matthew Garrett wrote:
> > > ACPI provides no guarantees about what level of hardware functionality
> > > remains during S3. You don't have any useful ability to determine which
> > > events will generate wakeups. And from a purely practical point of view,
> > > since the latency is in the range of seconds, you'll never have a low
> > > enough wakeup rate to hit it.
> >
> > Right, it does not as of today. So we cannot use that on x86
> > hardware. Fine. That does not prevent us to implement it for
> > architectures which can do it. And if x86 comes to the point where it
> > can handle it as well we're going to use it. Where is the problem ? If
> > x86 cannot guarantee the wakeup sources it's not going to be used for
> > such devices. The kernel just does not provide the service for it, so
> > what ?
>
> We were talking about PCs. Suspend-as-c-state is already implemented for
> OMAP.

Ah, now we talk about PCs. And all of a sudden the problem of the
inability to determine wakeup sources is no longer relevant ? So
how do you guarantee that we don't miss one if we can't figure out
which ones are kept alive in S3 ?

> > So the only thing you are imposing to app writers is to use an
> > interface which solves nothing and does not save you any power at
> > all.
>
> It's already been demonstrated that the Android approach saves power.

Demonstrated ? Care to explain to me how it makes a difference:

    while (1) {
        block();
        read();
        process_event();
        unblock();
        ---> suspend
        <--- resume
        do_crap();      /* 1000000 cycles */
    }

vs.

    while (1) {
        read();
        ---> suspend
        <--- resume
        process_event();
        do_crap();      /* 1000000 cycles */
    }

You spend the damned 1000000 cycles in any case, just at a different
point in time. So if you are so convinced and have fully understood
all the implications, please enlighten me why do_crap() costs less
power with the blockers approach.
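
The accounting behind that argument can be written out as a rough sketch. The assumptions are loud ones: only the 1000000-cycle cost of do_crap() comes from the message, the cost of process_event() is an invented illustrative figure, and suspend/resume and block()/unblock() are modelled as free. Under those assumptions the total per iteration is the same for both loops; only the amount of work done before the suspend point differs.

```python
# Per-step costs in CPU cycles. Only DO_CRAP comes from the message above;
# PROCESS_EVENT is an illustrative assumption.
DO_CRAP = 1_000_000
PROCESS_EVENT = 50_000

# Each loop body as an ordered list of (step, cycles). Suspend/resume and
# block()/unblock() are modelled as free, which is an assumption.
WITH_BLOCKERS = [
    ("block", 0), ("read", 0), ("process_event", PROCESS_EVENT),
    ("unblock", 0), ("suspend", 0), ("resume", 0), ("do_crap", DO_CRAP),
]
WITHOUT_BLOCKERS = [
    ("read", 0), ("suspend", 0), ("resume", 0),
    ("process_event", PROCESS_EVENT), ("do_crap", DO_CRAP),
]


def cycles_per_iteration(steps):
    """Total cycles spent in one pass through the loop body."""
    return sum(cost for _, cost in steps)


def cycles_before_suspend(steps):
    """Cycles spent between the start of the loop body and the suspend point."""
    total = 0
    for name, cost in steps:
        if name == "suspend":
            break
        total += cost
    return total


if __name__ == "__main__":
    for label, steps in (("blockers", WITH_BLOCKERS), ("plain", WITHOUT_BLOCKERS)):
        print(label, cycles_per_iteration(steps), cycles_before_suspend(steps))
```

A real comparison would also have to include the energy cost of the suspend and resume transitions themselves, which this sketch deliberately leaves out.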

And you are also stubbornly refusing to answer my analysis about the
effect on apps which do not use the blocker or are not allowed to.

1) The kernel blocker does not guarantee that the lousy app has
   processed the event. It just guarantees that the lousy app has
   emptied the input queue. So what's the point of the kernel blocker
   in that case ?

2) What's the difference to setting that app to QoS(NONE) and letting
   the kernel happily ignore it ?

Come up with real explanations and numbers and not just the "it has
been demonstrated" chant which is not holding water if you look at the
above.

> > Runnable tasks and QoS guarantees are the indicators whether you can
> > go to opportunistic suspend or not. Everything else is just window
> > dressing.
>
> As I keep saying, this is all much less interesting if you don't care
> about handling suboptimal applications. If you do care about them then
> the Android approach works. Nobody has demonstrated a scheduler-based
> one that does.

That does not make the android approach any better. They should have
talked to us upfront and not after the fact. Just because they decided
to do that in their google basement w/o talking to people who care is
not proving that it's a good solution and even less a reason to merge
it as is.

The kernel history is full of examples where crappy solutions got
rejected and kept out of the kernel for a long time even if there was
a need for them in the application field and they got shipped in
quantities with out of tree patches (NOHZ, high resolution timers,
....). At some point people stopped arguing for crappy solutions and
sat down and got it right. The problem of power management and
opportunistic suspend is not different in any way.

Thanks,

tglx

From: Thomas Gleixner on 27 May 2010 15:00

On Thu, 27 May 2010, Alan Cox wrote:

> > No, it's not. Forced suspend may be in response to hitting a key, but it
>
> You are the only person here talking about 'forced' suspends. The rest of
> us are talking about idling down and ensuring we are always in a state we
> un-idle correctly.
>
> > may also be in response to a 30 minute timeout expiring. If I get a WoL
> > packet in the 0.5 of a second between userspace deciding to suspend and
> > actually doing so, the system shouldn't suspend.
>
> I don't think that argument holds water in the form you have it
>
> What about 5 nanoseconds before you suspend. Now you can't do that (laws
> of physics and stuff).
>
> So your position would seem to be "we have a race but can debate how big
> is permissible"
>
> The usual model is
>
> "At no point should we be in a position when entering a suspend style
> deep sleep where we neither abort the suspend, nor commit to a
> suspend/resume sequence if the events we care about occur"
>
> and that is why the hardware model is
>
>     Set wake flags
>     Check if idle
>     If idle
>         Suspend
>     else
>         Clear wake flags
>         Unwind
>
> and the wake flags guarantee that an event at any point after the wake
> flags are set until they are cleared will cause a suspend to be resumed,
> possibly immediately after the suspend.

And if a platform can not guarantee the wakeup or the lossless
transition of states then you can not solve this by throwing blockers
or whatever into the code.

Thanks,

tglx

From: Matthew Garrett on 27 May 2010 15:00

On Thu, May 27, 2010 at 07:48:40PM +0100, Alan Cox wrote:
> > The application is a network monitoring app that renders server state
> > via animated bouncing cows. The desired behaviour is that the
> > application will cease to be scheduled if the session becomes idle
> > (where idle is defined as the system having received no user input for
> > 30 seconds) but that push notifications from the server still be
> > received in order to allow the application to present the user with
> > critical alerts.
>
> This is a bit confusing - does the screen come back on for such events,
> what constraints is the server operating under ? How does your code look
> - it's hard to imagine the examples you've given as being workable given
> they would block on network packet wait when a critical event occurs.
> Are you using poll or threads or what ?

It's code that's broadly identical to what I posted. The screen will
come on if the event is critical, won't otherwise.

> > Under Android:
> >
> > User puts down phone. 30 seconds later the screen turns off and releases
> > the last user-level suspend block. The phone enters suspend and the
> > application is suspended. A network packet is received, causing the
> > network driver to take a suspend block. The application finishes the
> > frame it was drawing, takes its own suspend block and reads the network
> > packet. In doing so the network driver releases its suspend block, but
> > since userspace is holding one the phone stays awake. The application
> > then handles the event as necessary, releases its suspend block and the
> > phone goes to sleep again.
> >
> > I don't see how this behaviour can be emulated in your model.
>
> User puts down phone. 30 seconds later the X server decides to turn the
> screen off and closes the device. This probably releases the constraint
> held via the display driver not to suspend. Any further draw requests will
> block.
>
> System looks at the other tasks and sees they are idle and can sink to a
> low power state. Cows is either blocked on a packet receive or could even
> be blocked on writing to the display (or both if its a realistic example
> and using poll)

Even if it's using poll, it could block purely on the display if X turns
the screen off between poll() waking and the write being made.

> The kernel looks at the constraints it has
>     - must not sink to a state below which network receive of packets
>       fails
>     - must not sink below a state where whatever is needed for the
>       critical alert code etc to do its stuff
>     - must not sink to a state which takes more than [constraint]
>       seconds to get back out of
>
> It picks 'opportunistic suspend'
> It goes to sleep
>
> A packet arrives
> It wakes the hardware
> We are busy, we do not wish to suspend
> It processes the packet
> It wakes the user app
> It starts processing the packet
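
The constraint-driven choice described in the quoted text can be sketched roughly as "pick the lowest-power state that violates no constraint". Everything named here is hypothetical: SleepState, pick_state(), the three example states and all power and latency figures are invented for illustration, and only two of the three constraints (network receive and exit latency) are modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SleepState:
    """One candidate low-power state (hypothetical model, illustrative numbers)."""
    name: str
    power_mw: float         # power drawn while in this state
    exit_latency_s: float   # time needed to get back out of it
    net_rx_wakes: bool      # can a received packet still wake the system?


# Illustrative states only; real platforms expose different sets and values.
STATES = [
    SleepState("idle", 50.0, 0.001, True),
    SleepState("opportunistic suspend", 5.0, 1.0, True),
    SleepState("power off", 0.0, 30.0, False),
]


def pick_state(states, max_exit_latency_s, need_net_rx):
    """Return the lowest-power state meeting every constraint, or None."""
    allowed = [
        s for s in states
        if s.exit_latency_s <= max_exit_latency_s
        and (s.net_rx_wakes or not need_net_rx)
    ]
    return min(allowed, key=lambda s: s.power_mw, default=None)
```

With a 2 second latency budget and network receive required, pick_state(STATES, 2.0, True) selects the "opportunistic suspend" entry in this model; tightening the latency budget to 10 ms falls back to "idle".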

If it's blocked on the write then it only starts processing the packet
again if the screen wakes up. You need to power up every piece of
hardware that an application's blocked on, just in case they need to
complete that read or write in order to get back to the event loop where
they have the opportunity to read the network packet.

So, yes, I think this can work in that case. But it doesn't work in
others - you won't idle applications that aren't accessing hardware
drivers.

As an aside, I think this is a good idea in any case since a fringe
benefit is the removal of the requirement to use the process freezer in
suspend to RAM...

> Stop transitioning Run->Forced Suspend. If you've got stuff stuck running
> then deal with it by constraining it to go idle or by blowing it out of
> the water. PM will then do the rest.

The problem is determining how to constrain it to go idle, where "idle"
is defined as "Doesn't wake up until a wakeup event is received". It's
acceptable for something to use as much CPU as it wants when the user is
actively interacting with the device, but in most cases processes
shouldn't be permitted to use any CPU when the session is idle. The
question is how to implement something that allows a CPU-guzzling
application to be idled without impairing its ability to process wakeup
events.

--
Matthew Garrett | mjg59(a)srcf.ucam.org

From: Alan Cox on 27 May 2010 15:00

On Thu, 27 May 2010 19:17:58 +0100
Matthew Garrett <mjg59(a)srcf.ucam.org> wrote:

> On Thu, May 27, 2010 at 08:06:38PM +0200, Peter Zijlstra wrote:
> > On Thu, 2010-05-27 at 18:59 +0100, Matthew Garrett wrote:
> > > On Thu, May 27, 2010 at 07:56:21PM +0200, Peter Zijlstra wrote:
> > > > On Thu, 2010-05-27 at 18:52 +0100, Matthew Garrett wrote:
> > > >
> > > > > If that's what you're aiming for then you don't need to block
> > > > > applications on hardware access because they should all already have
> > > > > idled themselves.
> > > >
> > > > Correct, a well behaved app would have. I thought we all agreed that
> > > > well behaved apps weren't the problem?
> > >
> > > Ok. So the existing badly-behaved application ignores your request and
> > > then gets blocked. And now it no longer responds to wakeup events.
> >
> > It will, when it gets unblocked from whatever thing it got stuck on.
>
> It's blocked on the screen being turned off. It's supposed to be reading
> a network packet. How does it ever get to reading the network packet?

That's a stupid argument. If you write broken code then it doesn't work.
You know if I do

    ls < unopenedfifo

it blocks too.

There is a difference between dealing with apps that overconsume
resources and arbitrarily broken code (which your suspend blocker case
doesn't fix either but makes worse).

Can we stick to sane stuff ?