Unify KVM kernel-space and user-space code into a single project [Kernel]

From: Antoine Martin on 21 Mar 2010 16:40

On 03/22/2010 03:24 AM, Avi Kivity wrote:
> On 03/21/2010 10:18 PM, Antoine Martin wrote:
>>> That includes the guest kernel. If you can deploy a new kernel in
>>> the guest, presumably you can deploy a userspace package.
>>
>> That's not always true.
>> The host admin can control the guest kernel via "kvm -kernel" easily
>> enough, but he may or may not have access to the disk that is used in
>> the guest. (think encrypted disks, service agreements, etc)
>
> There is a matching -initrd argument that you can use to launch a daemon.
I thought this discussion was about making it easy to deploy... and
generating a custom initrd isn't easy by any means, and it requires
access to the guest filesystem (and its mkinitrd tools).
> I believe that -kernel use will be rare, though. It's a lot easier
> to keep everything in one filesystem.
Well, for what it's worth, I rarely ever use anything else. My virtual
disks are raw so I can loop mount them easily, and I can also switch my
guest kernels from outside... without ever needing to mount those disks.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Ingo Molnar on 21 Mar 2010 17:00

* Avi Kivity <avi(a)redhat.com> wrote:

> On 03/21/2010 09:06 PM, Ingo Molnar wrote:
> >* Avi Kivity<avi(a)redhat.com> wrote:
> >
> >>>>[...] Second, from my point of view all contributors are volunteers
> >>>>(perhaps their employer volunteered them, but there's no difference from
> >>>>my perspective). Asking them to repaint my apartment as a condition to
> >>>>get a patch applied is abuse. If a patch is good, it gets applied.
> >>>This is one of the weirdest arguments i've seen in this thread. Almost all
> >>>the time do we make contributions conditional on the general shape of the
> >>>project. Developers dont get to do just the fun stuff.
> >>So, do you think a reply to a patch along the lines of
> >>
> >> NAK. Improving scalability is pointless while we don't have a decent GUI.
> >> I'll review your RCU patches _after_ you've contributed a usable GUI.
> >>
> >>?
> >What does this have to do with RCU?
>
> The example was rcuifying kvm which took place a bit ago. Sorry, it wasn't
> clear.
>
> > I'm talking about KVM, which is a Linux kernel feature that is useless
> > without a proper, KVM-specific app making use of it.
> >
> > RCU is a general kernel performance feature that works across the board.
> > It helps KVM indirectly, and it helps many other kernel subsystems as
> > well. It needs no user-space tool to be useful.
>
> Correct. So should I tell someone that has sent a patch that rcu-ified kvm
> in order to scale it, that I won't accept the patch unless they do some
> usability userspace work? say, implementing an eject button. That's what I
> understood you to mean.

Of course you could say the following:

' Thanks, I'll mark this for v2.6.36 integration. Note that we are not
able to add this to the v2.6.35 kernel queue anymore as the ongoing
usability work already takes up all of the project's maintainer and
testing bandwidth. If you want the feature to be merged sooner than that
then please help us cut down on the TODO and BUGS list that can be found
at XYZ. There are quite a few low-hanging fruits there. '

Although this RCU example is the 'worst' possible example, as it's a pure
speedup change with no functionality effect.

Consider the _other_ examples that are a lot more clear:

' If you expose paravirt spinlocks via KVM please also make sure the KVM
tooling can make use of it, has an option for it to configure it, and
that it has sufficient efficiency statistics displayed in the tool for
admins to monitor.'

' If you create this new paravirt driver then please also make sure it can
be configured in the tooling. '

' Please also add a testcase for this bug to tools/kvm/testcases/ so we don't
repeat this same mistake in the future. '

I'd say most of the high-level feature work in KVM has tooling impact.

And note the important argument that the 'eject button' thing would not occur
naturally in a project that is well designed and has a good quality balance.
It would only occur in the transitionary period if a big lump of lower-quality
code is unified with higher-quality code. Then indeed a lot of pressure gets
created on the people working on the high-quality portion to go over and fix
the low-quality portion.

Which, btw., is an unconditionally good thing ...

But even an RCU speedup can be fairly linked/ordered to more pressing needs of
a project.

Really, the unification of two tightly related pieces of code has numerous
clear advantages. Please give it some thought before rejecting it.

Thanks,

Ingo

From: Ingo Molnar on 21 Mar 2010 17:10

* Avi Kivity <avi(a)redhat.com> wrote:

> On 03/21/2010 09:59 PM, Ingo Molnar wrote:
> >
> >Frankly, i was surprised (and taken slightly off base) by both Avi and Anthony
> >suggesting such a clearly inferior "add a daemon to the guest space" solution.
> >It's a usability and deployment non-starter.
>
> It's only clearly inferior if you ignore every consideration against it.
> It's definitely not a deployment non-starter, see the tons of daemons that
> come with any Linux system. [...]

Avi, please don't put arguments into my mouth that i never made.

My (clearly expressed) argument was that:

_a new guest-side daemon is a transparent instrumentation non-starter_

What is so hard to understand about that simple concept? Instrumentation is
good if it's as transparent as possible.

Of course lots of other features can be done via a new user-space package ...

Thanks,

Ingo

From: Avi Kivity on 21 Mar 2010 17:10

On 03/21/2010 10:31 PM, Antoine Martin wrote:
> On 03/22/2010 03:24 AM, Avi Kivity wrote:
>> On 03/21/2010 10:18 PM, Antoine Martin wrote:
>>>> That includes the guest kernel. If you can deploy a new kernel in
>>>> the guest, presumably you can deploy a userspace package.
>>>
>>> That's not always true.
>>> The host admin can control the guest kernel via "kvm -kernel" easily
>>> enough, but he may or may not have access to the disk that is used
>>> in the guest. (think encrypted disks, service agreements, etc)
>>
>> There is a matching -initrd argument that you can use to launch a
>> daemon.
> I thought this discussion was about making it easy to deploy... and
> generating a custom initrd isn't easy by any means, and it requires
> access to the guest filesystem (and its mkinitrd tools).

That's true. You need to run mkinitrd anyway, though, unless your guest
is non-modular and non-lvm.

>> I believe that -kernel use will be rare, though. It's a lot easier
>> to keep everything in one filesystem.
> Well, for what it's worth, I rarely ever use anything else. My virtual
> disks are raw so I can loop mount them easily, and I can also switch
> my guest kernels from outside... without ever needing to mount those
> disks.

Curious, what do you use them for?

btw, if you build your kernel outside the guest, then you already have
access to all its symbols, without needing anything further.

--
Do not meddle in the internals of kernels, for they are subtle and quick to panic.


From: Ingo Molnar on 21 Mar 2010 17:30

* Avi Kivity <avi(a)redhat.com> wrote:

> > Well, for what it's worth, I rarely ever use anything else. My virtual
> > disks are raw so I can loop mount them easily, and I can also switch my
> > guest kernels from outside... without ever needing to mount those disks.
>
> Curious, what do you use them for?
>
> btw, if you build your kernel outside the guest, then you already have
> access to all its symbols, without needing anything further.

There are two errors in your argument:

1) you are assuming that it's only about kernel symbols

Look at this 'perf report' output:

# Samples: 7127509216
#
# Overhead  Command  Shared Object      Symbol
# ........  .......  .................  ......
#
    19.14%  git      git                [.] lookup_object
    15.16%  perf     git                [.] lookup_object
     4.74%  perf     libz.so.1.2.3      [.] inflate
     4.52%  git      libz.so.1.2.3      [.] inflate
     4.21%  perf     libz.so.1.2.3      [.] inflate_table
     3.94%  git      libz.so.1.2.3      [.] inflate_table
     3.29%  git      git                [.] find_pack_entry_one
     3.24%  git      libz.so.1.2.3      [.] inflate_fast
     2.96%  perf     libz.so.1.2.3      [.] inflate_fast
     2.96%  git      git                [.] decode_tree_entry
     2.80%  perf     libc-2.11.90.so    [.] __strlen_sse42
     2.56%  git      libc-2.11.90.so    [.] __strlen_sse42
     1.98%  perf     libc-2.11.90.so    [.] __GI_memcpy
     1.71%  perf     git                [.] decode_tree_entry
     1.53%  git      libc-2.11.90.so    [.] __GI_memcpy
     1.48%  git      git                [.] lookup_blob
     1.30%  git      git                [.] process_tree
     1.30%  perf     git                [.] process_tree
     0.90%  perf     git                [.] tree_entry
     0.82%  perf     git                [.] lookup_blob
     0.78%  git      [kernel.kallsyms]  [k] kstat_irqs_cpu

kernel symbols are only a small portion of the symbols. (a single line in this
case)

To get to those other symbols we have to read the ELF symbols of those
binaries in the guest filesystem, in the post-processing/reporting phase. This
is both complex to do and relatively slow so we don't want to (and cannot) do
this at sample time from IRQ context or NMI context ...

Also, many aspects of reporting are interactive so it's done lazily or
on-demand. So we need ready access to the guest filesystem - for those guests
which decide to integrate with the host for this.
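
[Editor's illustration, not part of the original mail.] The lazy,
report-time symbol lookup described above can be sketched roughly as
follows. This is a minimal sketch and not how perf is implemented: the
class name, the loader callback and the fake symbol tables are all
hypothetical. It assumes a sample records only a (shared object, address)
pair, and that the symbol table of a shared object is read the first time
the report needs it, then cached.

```python
import bisect
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical symbol record: (start address, size in bytes, name).
Symbol = Tuple[int, int, str]


class LazySymbolResolver:
    """Resolve sample addresses to symbol names at report time.

    The symbol table of a shared object is loaded only on first use and
    then cached, so nothing expensive happens at sample time.
    """

    def __init__(self, load_symbols: Callable[[str], List[Symbol]]):
        # load_symbols stands in for "read the ELF symbols of this binary
        # from the guest filesystem"; it is an assumption of this sketch.
        self._load = load_symbols
        self._cache: Dict[str, Tuple[List[int], List[Symbol]]] = {}
        self.loads = 0  # number of symbol tables actually read

    def _table(self, dso: str) -> Tuple[List[int], List[Symbol]]:
        if dso not in self._cache:
            syms = sorted(self._load(dso))
            self._cache[dso] = ([s[0] for s in syms], syms)
            self.loads += 1
        return self._cache[dso]

    def resolve(self, dso: str, addr: int) -> Optional[str]:
        starts, syms = self._table(dso)
        # Find the last symbol that starts at or before addr.
        i = bisect.bisect_right(starts, addr) - 1
        if i < 0:
            return None
        start, size, name = syms[i]
        return name if addr < start + size else None


# Made-up symbol tables, used only to exercise the sketch.
FAKE_TABLES: Dict[str, List[Symbol]] = {
    "git": [(0x2000, 0x80, "find_pack_entry_one"),
            (0x1000, 0x100, "lookup_object")],
    "libz.so.1.2.3": [(0x4000, 0x200, "inflate")],
}


def fake_loader(dso: str) -> List[Symbol]:
    return FAKE_TABLES[dso]
```

With a loader that really reads ELF files, the expensive step runs once
per shared object and only for objects the report actually touches, which
is why it needs ready access to the guest filesystem at report time
rather than at sample time.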

2) the 'SystemTap mistake'

You are assuming that the symbols of the kernel when it got built got saved
properly and are discoverable easily. In reality those symbols can be erased
by a make clean, can be modified by a new build, can be misplaced and can
generally be hard to find because each distro puts them in a different
installation path.
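
[Editor's illustration, not part of the original mail.] To make the
discoverability problem concrete, here is a minimal sketch of what a tool
ends up doing when it has to hunt for a saved kernel image with symbols.
The function name and the candidate path list are assumptions chosen for
illustration; the list is neither exhaustive nor tied to any particular
distro, and a hit does not prove the file matches the running kernel.

```python
import os
from typing import Optional

# Illustrative candidate locations only (an assumption of this sketch).
CANDIDATE_PATTERNS = [
    "boot/vmlinux-{release}",
    "usr/lib/debug/boot/vmlinux-{release}",
    "usr/lib/debug/lib/modules/{release}/vmlinux",
    "lib/modules/{release}/build/vmlinux",
]


def find_vmlinux(release: str, root: str = "/") -> Optional[str]:
    """Return the first candidate vmlinux path that exists, else None.

    root lets the search run against a mounted guest image or a test
    directory instead of the host's own filesystem.
    """
    for pattern in CANDIDATE_PATTERNS:
        path = os.path.join(root, pattern.format(release=release))
        if os.path.isfile(path):
            return path
    return None
```

Every entry in such a list is a guess that can go stale after a rebuild
or a 'make clean', which is the fragility the paragraph above describes.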

My 10+ years experience with kernel instrumentation solutions is that
kernel-driven, self-sufficient, robust, trustable, well-enumerated sources of
information work far better in practice.

The thing is, in this thread i'm forced to repeat the same basic facts again
and again. Could you _PLEASE_, pretty please, when it comes to instrumentation
details, at least _read the mails_ of the guys who actually ... write and
maintain Linux instrumentation code? This is getting ridiculous really.

Thanks,

Ingo
