From: Matthew Garrett on
On Thu, May 13, 2010 at 02:34:55PM -0700, Tony Lindgren wrote:
> * Matthew Garrett <mjg(a)redhat.com> [100513 14:16]:
> > What race-free mechanism do you use to ensure that? It's very easy to
> > handwave these problems away. It's very difficult to actually write an
> > implementation that works.
>
> Can you describe where do you see the race now?

1) Trusted app decides to suspend
2) Network packet that would otherwise wake the system is received
3) Trusted app sends SIGSTOP to untrusted userspace
4) Network packet sits waiting for stopped userspace to process it

Unless the trusted userspace gets woken up on every event that would
potentially cause a wakeup, you're racy. And the alternative involves an
extra userspace wakeup for every network packet - which is expensive.

--
Matthew Garrett | mjg59(a)srcf.ucam.org
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Alan Stern on
On Fri, 14 May 2010, Brian Swetland wrote:

> In tickless mode, the time until next timer is a signed int, so the
> longest the kernel will ever sleep is ~2 seconds at a go. In
> practice, userspace entities often have polling behavior that can
> trigger more often than that, and I've observed some kernel periodic
> timers (haven't cataloged them recently) that happen more often than
> once a second.

Paul and Kevin, how does the OMAP implementation handle these
difficulties?

Also, how does it handle the issue of ill-behaved apps?

Alan Stern

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rafael J. Wysocki on
On Saturday 15 May 2010, Kevin Hilman wrote:
> "Rafael J. Wysocki" <rjw(a)sisk.pl> writes:
>
> > On Friday 14 May 2010, Kevin Hilman wrote:
> >> Kevin Hilman <khilman(a)deeprootsystems.com> writes:
> >>
> >> > "Rafael J. Wysocki" <rjw(a)sisk.pl> writes:
> >> >
> >> >> On Thursday 13 May 2010, Tony Lindgren wrote:
> >> >>> * Rafael J. Wysocki <rjw(a)sisk.pl> [100513 14:16]:
> >> >
> >> > [...]
> >> >
> >> >>>
> >> >>> > It solves a practical issue that _at_ _the_ _moment_ cannot be solved
> >> >>> > differently, while there's a growing number of out-of-tree drivers depending
> >> >>> > on this framework. We need those drivers in and because we don't have any
> >> >>> > viable alternative at hand, we have no good reason to reject it.
> >> >>>
> >> >>> Nothing is preventing merging the drivers can be merged without
> >> >>> these calls.
> >> >>
> >> >> And yet, there _is_ a growing nuber of drivers that don't get merge because
> >> >> of that. That's _reality_. Are you going to discuss with facts, or what?
> >> >
> >> > It may be reality, but IMO, "preventing other drivers" isn't a good
> >> > *technical* argument for merging a feature. It feels like these "for
> >> > the 'good' of the community" arguments are being used to trump the
> >> > technical arguments. Maybe we need to keep the separate.
> >>
> >> To continue along the "for the good of the community" path...
> >>
> >> If it truly is the lack of a suspend blocker API that is preventing
> >> the merge of these out of tree drivers, I second Mark's proposal[1] to
> >> merge a noop version of the API while the technical issues continue to
> >> be discussed.
> >
> > I'm against that, sorry.
>
> OK, I'll bite... Why?

Because in that case the real feature will always be opposed as "unnecessary"
and never merged. I very much prefer to decide whether to merge it or reject
it right now.

> >> Then we would see how many drivers get submitted and merged.
> >>
> >> Personally, I suspect that lack of this feature is not the real
> >> obstacle to getting these out-of-tree drivers upstream. Having this
> >> API upstream will not change the product schedules and corporate
> >> cultures that have prevented code from making its way upstream.
> >
> > But apparently it is considered as a suitable excuse.
>
> No, it is not a _technical_ excuse. Just a healthy, experience-based
> dose of skepticism.

I didn't mean that, actually. What I wanted to say is that people use the
lack of "wakelocks" in the mainline as an excuse not to push their drivers
upstream (which is kind of understandable, because there's a zero benefit to
them from mainlining their code, as they will have to maintain a separate
"Android" version of it anyway).

Sorry for the confusion.

> It was expressed because I find the arguments above for merging
> because it prevents out-of-tree drivers from merging quite
> unconvincing. This is not just about opportunistic suspend + suspend
> blockers specifically but comes from several years experience in the
> embedded Linux world.

In this particular case, the lack of the mainline's support for opportunistic
suspend (in the form of "wakelocks" to be precise) has been given as a main
obstacle against merging of several drivers at least.

So, let's not just easily generalize, please.

And this is not the only reason to push the opportunistic suspend feature
upstream IMO.

First, I think it is a legitimate approach to power management, whether you
like it or not. I haven't seen anyone seriously arguing against that yet.

Second, (as said above) there is a number of drivers _already_ depending
on it and that number is only going to grow given the popularity of Android.
They are not mainlined in part because their authors don't see a benefit from
doing so (usually the benefit is that you don't have to maintain your code out
of the tree, because the mainline does it for you to some extent, so if you
need to maintain a separate version yourself, the benefit is zero).

Next, the only thing the Arve's patches do is to give people an _option_ to use
opportunistic suspend if they need it or want it. It's not mandatory and not
even enabled by default, so I don't really see what the problem is. Quite on
the contrary, I'd like people to be able to use the mainline on Android systems
without major modifications, because potentially that can increase our tester
base (and developer base too in consequence) and including opportunistic
suspend in the mainline would be a step in that direction.

Finally, it appears to address some issues that at the moment we don't
seriously know how to address in a different way.

Thanks,
Rafael

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Mark Brown on
On Sat, 2010-05-15 at 22:14 +0200, Rafael J. Wysocki wrote:
> On Saturday 15 May 2010, Kevin Hilman wrote:
> > "Rafael J. Wysocki" <rjw(a)sisk.pl> writes:
> > > On Friday 14 May 2010, Kevin Hilman wrote:

> > >> If it truly is the lack of a suspend blocker API that is preventing
> > >> the merge of these out of tree drivers, I second Mark's proposal[1] to
> > >> merge a noop version of the API while the technical issues continue to
> > >> be discussed.

> > > I'm against that, sorry.

> > OK, I'll bite... Why?

> Because in that case the real feature will always be opposed as "unnecessary"
> and never merged. I very much prefer to decide whether to merge it or reject
> it right now.

FWIW (as the person who made the suggestion) I do think that it is
something one might call expedient rather than actually good.

> > It was expressed because I find the arguments above for merging
> > because it prevents out-of-tree drivers from merging quite
> > unconvincing. This is not just about opportunistic suspend + suspend
> > blockers specifically but comes from several years experience in the
> > embedded Linux world.

> In this particular case, the lack of the mainline's support for opportunistic
> suspend (in the form of "wakelocks" to be precise) has been given as a main
> obstacle against merging of several drivers at least.

Where are these objections coming from? The only example I've seen cited
is the G1 stuff, which is a fairly special case for a number of reasons
(including the underlying MSM BSP, which was pretty substantial itself).

> And this is not the only reason to push the opportunistic suspend feature
> upstream IMO.

Agreed, my purpose here is mostly to push back on what sound like
unrealistic expectations about what we're getting.

> Second, (as said above) there is a number of drivers _already_ depending
> on it and that number is only going to grow given the popularity of Android.
> They are not mainlined in part because their authors don't see a benefit from
> doing so (usually the benefit is that you don't have to maintain your code out
> of the tree, because the mainline does it for you to some extent, so if you
> need to maintain a separate version yourself, the benefit is zero).

Wakelocks are going to be a fairly minor part of any decision here -
it'd be pretty surprising if they take much effort to remove from a
driver. What's much more of an issue is that you've got essentially the
same situation as you have with the enterprise Linux distributions, a
fixed kernel version that vendors need to target. The differences that
implies are far more substantial than wakelocks for many areas of the
kernel, especially at the points in the cycle where the fixed kernel has
drifted furthest from the current mainline.

Things aren't quite the same as with the enterprise distributions,
though - the lifecycles for many parts in the consumer space are much,
much shorter than those in the enterprise markets. They can be
sufficiently short to mean that a mainline driver won't show up where
customers need it soon enough. For example, Google is currently
preparing an Android release based on 2.6.32 the merge window for which
was in September last year, over six months ago. This is an extremely
long latency if you're working on something in the region of a twelve
month cycle. The lack of standardisation for register interfaces in the
embedded space means that the generation to generation differences can
easily be sufficiently substantial to make a new driver the only
sensible option.

This isn't to say that the old parts just suddenly vanish, and clearly
there's an advantage from ongoing mainline inclusion, but the tradeoffs
are a bit different to those in other markets.

There's also a less pressure from end users towards mainline inclusion -
even on Linux people in the embedded space are used to having to get
code from multiple vendors working together so the lack of mainline
support isn't the sort of issue it would be with something like server
class hardware. This is changing over time as more and more vendors buy
into mainline but there's a way to go yet. If the part requires changes
outside the driver itself (a new or updated kernel interface, for
example) there's a pressure to just deal with it in the driver in a way
which is going to be unacceptable for mainline, possibly even involving
per-system code modifications in the driver. Sometimes ongoing mainline
development will mean that the driver needs updating anyway in ways that
are much more substantial than ripping out wakelocks would be.

Having wakelocks in does make things easier for drivers using them but
we need to recognise that all sorts of other things that are much harder
to deal with will also come up.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Kevin Hilman on
Alan Stern <stern(a)rowland.harvard.edu> writes:

> On Fri, 14 May 2010, Brian Swetland wrote:
>
>> In tickless mode, the time until next timer is a signed int, so the
>> longest the kernel will ever sleep is ~2 seconds at a go. In
>> practice, userspace entities often have polling behavior that can
>> trigger more often than that, and I've observed some kernel periodic
>> timers (haven't cataloged them recently) that happen more often than
>> once a second.
>
> Paul and Kevin, how does the OMAP implementation handle these
> difficulties?

just a minor clarification... these aren't OMAP-specific issues, but
generic issues to all power-sensitive kernel users.

The ~2 second limit was fixed by Jon Hunter (TI) and is in mainline
since 2.6.32[1].

For other timers, there has been active work (mostly by Intel folks)
on deferrable timers, coalescing timers, timer slack etc. that has
greatly reduced the kernel timer impact on wakeups.

> Also, how does it handle the issue of ill-behaved apps?

For userspace, apps that have polling behavior or are ill-behaved must
be found and fixed. Thanks to tools like powertop, this is a farily
easy task.

But really, I don't consider the "ill-behaved app" problem to be a
real-world problem. Both in maemo/meego and Android, if someone
writes an app that kills battery life, it will get reported as a bug,
or get bad ratings etc. On these kinds of devices, there is a *stong*
developer incentive to not write battery sucking apps.

Kevin


[1]
commit 97813f2fe77804a4464564c75ba8d8826377feea
Author: Jon Hunter <jon-hunter(a)ti.com>
Date: Tue Aug 18 12:45:11 2009 -0500

nohz: Allow 32-bit machines to sleep for more than 2.15 seconds

In the dynamic tick code, "max_delta_ns" (member of the
"clock_event_device" structure) represents the maximum sleep time
that can occur between timer events in nanoseconds.

The variable, "max_delta_ns", is defined as an unsigned long
which is a 32-bit integer for 32-bit machines and a 64-bit
integer for 64-bit machines (if -m64 option is used for gcc).
The value of max_delta_ns is set by calling the function
"clockevent_delta2ns()" which returns a maximum value of LONG_MAX.
For a 32-bit machine LONG_MAX is equal to 0x7fffffff and in
nanoseconds this equates to ~2.15 seconds. Hence, the maximum
sleep time for a 32-bit machine is ~2.15 seconds, where as for
a 64-bit machine it will be many years.

This patch changes the type of max_delta_ns to be "u64" instead of
"unsigned long" so that this variable is a 64-bit type for both 32-bit
and 64-bit machines. It also changes the maximum value returned by
clockevent_delta2ns() to KTIME_MAX. Hence this allows a 32-bit
machine to sleep for longer than ~2.15 seconds. Please note that this
patch also changes "min_delta_ns" to be "u64" too and although this is
unnecessary, it makes the patch simpler as it avoids to fixup all
callers of clockevent_delta2ns().

[ tglx: changed "unsigned long long" to u64 as we use this data type
through out the time code ]

Signed-off-by: Jon Hunter <jon-hunter(a)ti.com>
Cc: John Stultz <johnstul(a)us.ibm.com>
LKML-Reference: <1250617512-23567-3-git-send-email-jon-hunter(a)ti.com>
Signed-off-by: Thomas Gleixner <tglx(a)linutronix.de>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/