From: Neil Brown on
On Tue, 1 Jun 2010 03:49:37 +0200 (CEST)
Thomas Gleixner <tglx(a)linutronix.de> wrote:

> On Tue, 1 Jun 2010, Neil Brown wrote:
> > And if you are right that the race window cannot be closed, then the whole
> > suspend-blocker infrastructure is pointless as the purpose of it is simply to
> > close that window. If it really does not and cannot work, then it would be
> > best to reject it for that reason rather than for less concrete aesthetic
> > arguments.
> > But presumably it does work in practice on Android hardware so ..... confused.
> >
> > Having just seen the email from Thomas, maybe you mean that it cannot be
> > closed on devices using ACPI, but can on other devices. I can sort-of
> > imagine how that would be the case (I tried reading an ACPI spec once - my
> > hat is of to those of you who understand it).
> > That shouldn't prevent us from closing the race window on "sane" hardware
> > that allows it. This would, I think, be sufficient for Android's needs.
> >
> > I'm hoping we can get agreement on:
> > - there is a race with suspend
>
> That's a matter of how you define "suspend".

I define "suspend" as

echo mem > /sys/power/state

>
> If "suspend" is another deep idle state and the hardware is sane,
> there is no race at all - assumed that the driver/platform developer
> got it right. It's not rocket science to transition from "normal" irq
> delivery to wakeup based delivery raceless (except for PC style x86
> hardware of today)
>
> If "suspend" is the thing we are used to via /sys/power/state then the
> race will persist forever except for the suspend blocker workaround,
> which we can express in QoS terms as well w/o adding another suspend
> related user space API.

I'm not interested in adding another user-space API if it can possibly be
avoided, and I think it can. But that is a later step in the process.

I think you have acknowledged that there is a race with suspend - thanks.
Next step was "can it be closed".
You seem to suggest that it can, but you describe it as a "work around"
rather than a "bug fix"...

Do you agree that the race is a "bug", and therefore it is appropriate to
"fix" it assuming an acceptable fix can be found (which I think it can)?

If you agree that it is appropriate for try to fix this bug, then the next
step would be to get the Android devs to agree that a fix could - in
principle - address the need for which they created suspend-blockers.
Arve: can you confirm that?

Then, with a clear and agreed goal, we can look at possible fixes.

Thanks,
NeilBrown

>
> Thanks,
>
> tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Neil Brown on
On Mon, 31 May 2010 22:10:00 -0400 (EDT)
Alan Stern <stern(a)rowland.harvard.edu> wrote:

> On Tue, 1 Jun 2010, Neil Brown wrote:
>
> > > As I said before, we generally can't prevent such things from happening,
> > > because even if we handle the particular race described above, it still is
> > > possible that the event will be "lost" if it arrives just a bit later (eg.
> > > during a suspend-toggle switch). So the PRE_SUSPEND thing won't really
> > > solve the entire problem while increasing complexity.
> > >
> > > > My freerunner has a single core so without CONFIG_PREEMPT it may be that
> > > > there is no actual race-window - maybe the PRE_SUSPENDs will all run before a
> > > > soft_irq thread has a chance to finish handling of the interrupt (my
> > > > knowledge of these details is limits). But on a muilti-core device I think
> > > > there would definitely be a race-window.
> > >
> > > Yes, there always will be a race window. The only thing we can do is to
> > > narrow it, but we cannot really close it (at least not on a PC, but I'm not
> > > really sure it can be closed at all).
> >
> > Well you are the expert so I assume you are right, but I would really like to
> > understand why this is.
> >
> > I guess that if the event was delivered as an edge-triggered interrupt which
> > the interrupt controller couldn't latch, then it might get lost. Is that
> > what happens?
>
> You're barking up the wrong tree. If I understand correctly, Rafael is
> saying that there's a race involving events which are not _wakeup_
> events. If a non-wakeup event arrives shortly before a suspend, it can
> have its normal effect. If it arrives while a suspend is in progress,
> its delivery may be delayed until the system resumes. And if it
> arrives after the system is fully suspended, it may never be noticed at
> all.

I am only thinking of wakeup event. (And I am only thinking of suspend as
initiated by writing 'mem' to /sys/power/state.)
I agree that non-wakeup events can easily be delayed by suspend, and this is
by design and no-one wants it to change.

>
> With wakeup events the problem isn't so bad. Wakeup events are always
> noticed, and if the system is designed properly they will either abort
> a suspend-in-progress or else cause the system to resume as soon as the
> suspend is complete. (Linux is not yet properly designed in this
> sense.)

This is exactly the issue. Linux doesn't get this right. A wakeup that is
processed by the driver just before suspend may not get all the way out to
it's final destination (possibly in meat-space) until after some other event
wakes the device. It is not possible to guarantee full delivery of a wakeup
event with Linux in its current state. This is a bug.

>
> Or maybe I'm misunderstanding also, and Rafael is saying that there's
> a race involving events whose meaning changes depending on whether or
> not the system is asleep. This is obviously true and clearly
> unavoidable.
>
> > And if you are right that the race window cannot be closed, then the whole
> > suspend-blocker infrastructure is pointless as the purpose of it is simply to
> > close that window. If it really does not and cannot work, then it would be
> > best to reject it for that reason rather than for less concrete aesthetic
> > arguments.
> > But presumably it does work in practice on Android hardware so ..... confused.
>
> The point you're missing is that Android works with regard to wakeup
> events. It doesn't necessarily always receive non-wakeup events
> (although I don't know how Android classifies events -- maybe
> everything is a wakeup event for them).
>

I don't think I'm missing that. Wakeup events are the only events of
interest. But wakeup events are more than just low-level hardware events.
They are also the virtual events that are a direct consequence of the
original hardware event.
A UART asserts 'data ready' which is routed to an IO interrupt line that
wakes the device. This is a wakeup event.
So are the bits that appear on the data line which represent the data
So are the characters that appear on the serial port
So is the "here is a message from the GSM processor" event
So is the "there is a new TXT message" event
So is the "'beep beep beep' out the speaker" event

All these events are wakeup events, and should abort or instant-resume the
system. Currently only the first can be guaranteed to do this (... and maybe
the second, depending on the details of the implementation).
suspend-blocks effectively pass a lock from one event to the next to ensure
that all of these event block the suspend. There are various other ways to
achieve the same thing, and I think it can be done with no significant change
to the API. But some how we need to allow all of these events to be
linked wake-up events, not just the first one.

Thanks,
NeilBrown
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
2010/5/31 Rafael J. Wysocki <rjw(a)sisk.pl>:
> On Monday 31 May 2010, Arve Hj�nnev�g wrote:
>> 2010/5/30 Rafael J. Wysocki <rjw(a)sisk.pl>:
> ...
>>
>> I think it makes more sense to block suspend while wakeup events are
>> pending than blocking it everywhere timers are used by code that could
>> be called while handling wakeup events or other critical work. Also,
>> even if you did block suspend everywhere timers where used you still
>> have the race where a wakeup interrupt happens right after you decided
>> to suspend. In other words, you still need to block suspend in all the
>> same places as with the current opportunistic suspend code, so what is
>> the benefit of delaying suspend until idle?
>
> Assume for a while that you don't use suspend blockers, OK? �I realize you
> think that anything else doesn't make sense, but evidently some other people
> have that opinion about suspend blockers.
>

It sounded like you were suggesting that initiating suspend from idle
would somehow avoid the race condition with wakeup events. All I'm
saying is that you would need to block suspend in all the same places.
If you don't care about ignoring wakeup events, then sure you can
initiate suspend from idle.

> Now, under that assumption, I think it _generally_ is reasonable to make the
> system go into full suspend if everything (ie. CPUs and I/O) has been idle
> for sufficiently long time and there are no QoS requirements that aren't
> compatible with full system suspend.
>

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
On Mon, May 31, 2010 at 3:05 PM, Rafael J. Wysocki <rjw(a)sisk.pl> wrote:
> On Monday 31 May 2010, Neil Brown wrote:
>> On Thu, 27 May 2010 23:40:29 +0200 (CEST)
>> Thomas Gleixner <tglx(a)linutronix.de> wrote:
>>
>> > On Thu, 27 May 2010, Rafael J. Wysocki wrote:
>> >
>> > > On Thursday 27 May 2010, Thomas Gleixner wrote:
>> > > > On Thu, 27 May 2010, Alan Stern wrote:
>> > > >
>> > > > > On Thu, 27 May 2010, Felipe Balbi wrote:
>> > > > >
>> > > > > > On Thu, May 27, 2010 at 05:06:23PM +0200, ext Alan Stern wrote:
>> > > > > > >If people don't mind, here is a greatly simplified summary of the
>> > > > > > >comments and objections I have seen so far on this thread:
>> > > > > > >
>> > > > > > > � The in-kernel suspend blocker implementation is okay, even
>> > > > > > > � beneficial.
>> > > > > >
>> > > > > > I disagree here. I believe expressing that as QoS is much better. Let
>> > > > > > the kernel decide which power state is better as long as I can say I
>> > > > > > need 100us IRQ latency or 100ms wakeup latency.
>> > > > >
>> > > > > Does this mean you believe "echo mem >/sys/power/state" is bad and
>> > > > > should be removed? �Or "echo disk >/sys/power/state"? �They pay no
>> > > >
>> > > > mem should be replaced by an idle suspend to ram mechanism
>> > >
>> > > Well, what about when I want the machine to suspend _regardless_ of whether
>> > > or not it's idle at the moment? �That actually happens quite often to me. :-)
>> >
>> > Fair enough. Let's agree on a non ambigous terminology then:
>> >
>> > � � �forced:
>> >
>> > � � � � �suspend which you enforce via user interaction, which
>> > � � � � � � � � �also implies that you risk losing wakeups depending on
>> > � � � � � � � � �the hardware properties
>>
>> Reasonable definition I think. �However the current implementation doesn't
>> exactly match it.
>> With the current implementation you risk losing wakeups *independent* of the
>> hardware properties.
>
> Define "losing", please.
>
> Currently, we simply don't regard hardware signals occuring _during_ the
> suspend operation itself as wakeups (unless they are wakeup interrupts to be
> precise, because these _are_ taken into account by our current code).
>
> The reason is that the meaning of given event may be _different_ at run time
> and after the system has been suspended. �For example, consider a power button
> on a PC box. �If it's pressed at run time, it usually means "power off the
> system" to the kernel. �After the system has been suspended, however, it means
> "wake up". �So, you have to switch from one interpretation of the event to the
> other and that's not an atomic operaition (to put it lightly).
>
>> Even with ideal hardware events can be lost - by which I mean that they will
>> not be seen until some other event effects a wake-up.
>> e.g. the interrupt which signals the event happens immediately before the
>> suspend is requested (or maybe at the same time as), but the process which
>> needs to handle the event doesn't get a chance to see it before the suspend
>> procedure freezes that process, and even if it did it would have no way to
>> abort the suspend.
>>
>> So I submit that the current implementation doesn't match your description of
>> "forced", is therefore buggy, and that if it were fixed, that would be
>> sufficient to meet the immediate needs of android.
>
> I don't really think it may be fixed with respect to every possible kind of
> hardware. �On platforms where I/O interrupts are wakeup events it should
> work right now. �On other platforms it may be impossible to overcome hardware
> limitations.
>

There is no reason you cannot make the rtc alarms work reliably on x86
hardware. Even if you may loose key events while suspending I think it
is still valuable to have reliable alarms. I gave an example earlier
why reliable alarms are useful (dvr application).

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
On Mon, May 31, 2010 at 2:46 PM, Thomas Gleixner <tglx(a)linutronix.de> wrote:
> On Mon, 31 May 2010, James Bottomley wrote:
>>
>> For MSM hardware, it looks possible to unify the S and C states by doing
>> suspend to ram from idle but I'm not sure how much work that is.
>
> On ARM, it's not rocket science and we have in tree support for this
> already (OMAP). I have done the same thing on a Samsung part as a
> prove of concept two years ago and it's really easy as the hardware is
> sane. Hint: It's designed for mobile devices :)
>

We already enter the same power state from idle and suspend on msm. In
the absence of misbehaving apps, the difference in power consumption
is entirely caused by periodic timers in the user-space framework
_and_ kernel. It only takes a few timers triggering per second (I
think 3 if they do no work) to double the average power consumption on
the G1 if the radio is off. We originally added wakelocks because the
hardware we had at the time had much lower power consumption in
suspend then idle, but we still use suspend because it saves power.

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/