Unify KVM kernel-space and user-space code into a single project [Kernel]


From: Anthony Liguori on 22 Mar 2010 12:10

On 03/22/2010 10:55 AM, Ingo Molnar wrote:
> * Anthony Liguori<anthony(a)codemonkey.ws> wrote:
>
>
>> [...]
>>
>> I've been trying very hard to turn this into a productive thread attempting
>> to capture your feedback and give clear suggestions about how you can
>> achieve your desired functionality.
>>
> I'm glad that we are at this more productive stage. I'm still trying to
> achieve the very same technological capabilities that i expressed in the first
> few mails when i reviewed the 'perf kvm' patch that was submitted by Yanmin.
>
> The crux of the problem is very simple. To quote my earlier mail:
>
> |
> | - The inconvenience of having to type:
> | perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \
> | --guestmodules=/home/ymzhang/guest/modules top
> |
> |
> | is very obvious even with a single guest. Now multiply that by more guests ...
> |
>
> For example we want 'perf kvm top' to do something useful by default: it
> should find the first guest running and it should report its profile.
>
> The tool shouldnt have to guess about where the guests are, what their
> namespaces are and how to talk to them. We also want easy symbolic access to
> guests, for example:
>
> perf kvm -g OpenSuse-2 record sleep 1
>

Two things are needed. The first is the ability to enumerate running
guests and identify each by a symbolic name. I have a patch for this
and it'll be posted this week or so. perf will need to have a QMP
client and it will need to look in ${HOME}/.qemu/qmp/ for sockets to
connect to.

This is too much to expect from a client and we've got a GSoC idea
posted to make a nice library for tools to use to simplify this.

The sockets are named based on UUID and you'll have to connect to a
guest and ask it for its name. Some guests don't have names so we'll
have to come up with a clever way to describe a nameless VM.
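
A minimal sketch of the enumeration step described above, assuming the
proposed layout in which each running qemu exposes a QMP Unix socket named
after the guest UUID under ${HOME}/.qemu/qmp/. `qmp_capabilities` and
`query-name` are QMP commands; the helper names, the one-message-per-line
framing, and the `unnamed-<uuid prefix>` label for nameless VMs are
illustrative assumptions, not part of any posted patch.

```python
import json
import os
import socket


def qmp_command(sock_file, command):
    """Send one QMP command and return the reply, skipping async events."""
    sock_file.write(json.dumps({"execute": command}) + "\n")
    sock_file.flush()
    while True:
        line = sock_file.readline()
        if not line:
            raise ConnectionError("QMP socket closed")
        reply = json.loads(line)
        if "event" not in reply:
            return reply


def guest_name(path, timeout=1.0):
    """Connect to one QMP socket and return the guest name, or None."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        sock.connect(path)
        with sock.makefile("rw", encoding="utf-8") as f:
            json.loads(f.readline())            # server greeting
            qmp_command(f, "qmp_capabilities")  # leave negotiation mode
            reply = qmp_command(f, "query-name")
    return reply.get("return", {}).get("name")


def list_guests(qmp_dir=None):
    """Map socket file name (assumed to be the UUID) to a symbolic name."""
    qmp_dir = qmp_dir or os.path.expanduser("~/.qemu/qmp")
    guests = {}
    if not os.path.isdir(qmp_dir):
        return guests
    for entry in sorted(os.listdir(qmp_dir)):
        path = os.path.join(qmp_dir, entry)
        try:
            name = guest_name(path)
        except (OSError, ValueError):
            continue  # stale socket, timeout, or not a QMP endpoint
        # Hypothetical fallback label for a guest without a name.
        guests[entry] = name or "unnamed-" + entry[:8]
    return guests
```

A tool like perf could call `list_guests()` once at startup and resolve
an argument such as `-g OpenSuse-2` against the returned names.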

> I.e.:
>
> - Easy default reference to guest instances, and a way for tools to
> reference them symbolically as well in the multi-guest case. Preferably
> something trustable and kernel-provided - not some indirect information
> like a PID file created by libvirt-manager or so.
>

A guest is not a KVM concept. It's a qemu concept so it needs to be
something provided by qemu. The other caveat is that you won't see
guests created by libvirt because we're implementing this in terms of a
default QMP device and libvirt will disable defaults. This is desired
behaviour. libvirt wants to be in complete control and doesn't want a
tool like perf interacting with a guest directly.

> - Guest-transparent VFS integration into the host, to recover symbols and
> debug info in binaries, etc.
>

The way I'd like to see this implemented is a guest userspace daemon. I
think having the guest userspace daemon be something that can be updated
by the host is reasonable.

In terms of exposing that on the host, my preferred approach is QMP.
I'd be happy with a QMP command that is essentially,
guest_fs_read(filename) and guest_fd_readdir(path).
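
As a rough illustration of what a client of those two commands might look
like: the command names and their `filename`/`path` parameters are taken
from the sentence above, and the `execute`/`arguments` envelope and the
`error`/`desc` reply shape are standard QMP. Everything else is an
assumption, in particular that file contents would come back
base64-encoded in `return`. No such commands exist in QMP today.

```python
import base64
import json


def build_request(command, **arguments):
    """Serialize a QMP request; parameters travel under 'arguments'."""
    return json.dumps({"execute": command, "arguments": arguments})


def guest_fs_read_request(filename):
    # Hypothetical command: read a file from the guest filesystem.
    return build_request("guest_fs_read", filename=filename)


def guest_fd_readdir_request(path):
    # Hypothetical command: list a directory in the guest filesystem.
    return build_request("guest_fd_readdir", path=path)


def decode_file_reply(reply_line):
    """Decode a reply, ASSUMING base64-encoded file contents in 'return'."""
    reply = json.loads(reply_line)
    if "error" in reply:
        raise OSError(reply["error"].get("desc", "guest read failed"))
    return base64.b64decode(reply["return"])
```

With something along these lines, perf could fetch a guest's
/proc/kallsyms over the same socket it used to look up the guest's name.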

If desired, one could implement a fuse filesystem that interacted with
all local qemu instances to expose this on the host. There are a lot of
ugly things about fuse, though, so I think sticking to QMP is best
(particularly with respect to root access of a fuse filesystem).

With just those couple things in place, perf should be able to do
exactly what you want it to do.

Regards,

Anthony Liguori

> There were a few responses to that but none really addressed those problems -
> they mostly tried to re-define the problem and suggested that i was wrong to
> want such capabilities and suggested various inferior approaches instead. See
> the thread for the details - i think i covered every technical suggestion that
> was made.
>
> So we are still at an impasse as far as i can see. If i overlooked some
> suggestion that addresses these problems then please let me know ...
>
> Thanks,
>
> Ingo
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Avi Kivity on 22 Mar 2010 12:20

On 03/22/2010 06:08 PM, Ingo Molnar wrote:
> * Avi Kivity<avi(a)redhat.com> wrote:
>
>
>> On 03/22/2010 04:32 PM, Ingo Molnar wrote:
>>
>>> * Avi Kivity<avi(a)redhat.com> wrote:
>>>
>>>
>>>> On 03/22/2010 02:44 PM, Ingo Molnar wrote:
>>>>
>>>>> This is why i consider that line of argument rather dishonest ...
>>>>>
>>>> I am not going to reply to any more email from you on this thread.
>>>>
>>> Because i pointed out that i consider a line of argument intellectually
>>> dishonest?
>>>
>>> I did not say _you_ as a person are dishonest - doing that would be an ad
>>> hominem attack against your person. (In fact i dont think you are, to the
>>> contrary)
>>>
>>> An argument can certainly be labeled dishonest in a fair discussion and it
>>> is not a personal attack against you to express my opinion about that.
>>>
>>>
>> Sigh, why am I drawn into this.
>>
>> A person who uses dishonest arguments is a dishonest person. [...]
>>
> That's not how i understood that phrase - and i did not mean to suggest that
> you are dishonest and i do not think that you are dishonest (to the contrary).
>

Word games.

--
error compiling committee.c: too many arguments to function


From: Avi Kivity on 22 Mar 2010 12:20

On 03/22/2010 05:55 PM, Ingo Molnar wrote:
> * Anthony Liguori<anthony(a)codemonkey.ws> wrote:
>
>
>> [...]
>>
>> I've been trying very hard to turn this into a productive thread attempting
>> to capture your feedback and give clear suggestions about how you can
>> achieve your desired functionality.
>>
> I'm glad that we are at this more productive stage. I'm still trying to
> achieve the very same technological capabilities that i expressed in the first
> few mails when i reviewed the 'perf kvm' patch that was submitted by Yanmin.
>

No, you're not. You're trying to fracture the qemu community with your
tools/kvm proposal, you're explaining to me how I'm working on the wrong
thing by concentrating on things that my employer needs rather than what
you think kvm needs, and attaching various unsavoury labels to Anthony
and myself. Any wonder we aren't getting anything done?

If you can commit to a reasonable conversation we might be able to make
progress. Is this actually possible?

> The crux of the problem is very simple. To quote my earlier mail:
>
> |
> | - The inconvenience of having to type:
> | perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \
> | --guestmodules=/home/ymzhang/guest/modules top
> |
> |
> | is very obvious even with a single guest. Now multiply that by more guests ...
> |
>
> For example we want 'perf kvm top' to do something useful by default: it
> should find the first guest running and it should report its profile.
>
> The tool shouldnt have to guess about where the guests are, what their
> namespaces are and how to talk to them. We also want easy symbolic access to
> guests, for example:
>
> perf kvm -g OpenSuse-2 record sleep 1
>
> I.e.:
>
> - Easy default reference to guest instances, and a way for tools to
> reference them symbolically as well in the multi-guest case. Preferably
> something trustable and kernel-provided - not some indirect information
> like a PID file created by libvirt-manager or so.
>

Usually 'layering violation' is trotted out at such suggestions. I
don't like using the term, because sometimes the layers are incorrect
and need to be violated. But it should be done explicitly, not as a
shortcut for a minor feature (and profiling is a minor feature, most
users will never use it, especially guest-from-host).

The fact is we have well-defined layers today: kvm virtualizes the cpu
and memory, qemu emulates devices for a single guest, libvirt manages
guests. We break this sometimes but there has to be a good reason. So
perf needs to talk to libvirt if it wants names. This could be done via
linking, or via a plugin that libvirt drops into perf.

> - Guest-transparent VFS integration into the host, to recover symbols and
> debug info in binaries, etc.
>
> There were a few responses to that but none really addressed those problems -
> they mostly tried to re-define the problem and suggested that i was wrong to
> want such capabilities and suggested various inferior approaches instead. See
> the thread for the details - i think i covered every technical suggestion that
> was made.
>

You simply kept ignoring me when I said that if something can be kept
out of the kernel without impacting performance, it should be. I don't
want emergency patches closing some security hole or oops in a kernel
symbol server.

The usability argument is a red herring. True, it takes time for things
to trickle down to distributions and users. Those who can't wait can
download the code and compile, it isn't that difficult.

> So we are still at an impasse as far as i can see. If i overlooked some
> suggestion that addresses these problems then please let me know ...
>

The impasse is mostly due to you insisting on doing everything your way,
in the kernel, and disregarding how libvirt/qemu/kvm does things. Learn
the kvm ecosystem, you'll find it is quite easy to contribute code.

--
error compiling committee.c: too many arguments to function


From: Ingo Molnar on 22 Mar 2010 12:40

* Joerg Roedel <joro(a)8bytes.org> wrote:

> [...] Look at the state of the alpha arch in Linux today, it is maintained
> in one repository but nobody really cares about it. Thus it is miles behind
> most other archs Linux supports today in quality and feature completeness.

I dont know how you can find the situation of Alpha comparable, which is a
legacy architecture for which no new CPU was manufactured in the past ~10
years.

The negative effects of physical obsolescence cannot be overcome even by the
very best of development models ...

So this is a total non-argument in this context.

> On Mon, Mar 22, 2010 at 01:22:28PM +0100, Ingo Molnar wrote:
> >
> > * Joerg Roedel <joro(a)8bytes.org> wrote:
> >
> > > [...] Basically the reason for the oProfile failure is a dysfunctional
> > > community. [...]
> >
> > Caused by: repository separation and the inevitable code and social fork a
> > decade later.
>
> No, the split-repository situation was the smallest problem after all. It
> was a community thing. If the community doesn't work a single-repo project
> will also fail. [...]

So, what do you think creates code communities and keeps them alive?
Developers and code. And the wellbeing of developers is primarily influenced
by the repository structure and by the development/maintenance process - i.e.
by the 'fun' aspect. (i'm simplifying things there but that's the crux of it.)

So yes, i do claim that what stifled and eventually killed off the Oprofile
community was the split repository. None of the other Oprofile shortcomings
were really unfixable, but this one was. It gave no way for the community to
grow in a healthy way, after the initial phase. Features were more difficult
and less fun to develop.

And yes, there were times when there was still active Oprofile development but
the development process warning signs should have been noticed, and the
community could have been kept alive by unification and similar measures.
Instead what happened was a complete rewrite and a competitive replacement by
perf. (Which isnt particularly nice to users btw. - they prefer more gradual
transitions - but there was no other option, so many problems accumulated in
Oprofile.)

I simply do not want to see KVM face the same fate, and yes i do see similar
warning signs.

> > What you fail to realise (or what you fail to know, you werent around when
> > Oprofile was written, i was) is that Oprofile _did_ have a functional
> > single community when it was written. The tooling and the kernel bits were
> > written by the same people.
>
> Yes, this was probably the time when everybody was enthusiastic about the
> feature and they could attract lots of developers. But situation changed
> over time.

The thing is, the drift was pre-programmed by having a split ...

> > So i dont see much of a difference to the Oprofile situation really and i
> > see many parallels. I also see similar kinds of desktop usability
> > problems.
>
> The difference is that KVM has a working community with good developers and
> maintainers.

Oprofile certainly had good developers and maintainers as well. In the end it
wasnt enough ...

Also, a project can easily still be 'alive' but not reach its full potential.

Why do you assume that my argument means that KVM isnt viable today? It can
very well still be viable and even healthy - just not _as healthy_ as it could
be ...

> > The difference is that we dont have KVM with a decade of history and we
> > dont have a 'told you so' KVM reimplementation to show that proves the
> > point. I guess it's a matter of time before that happens, because Qemu
> > usability is so abysmal today - so i guess we should suspend any
> > discussions until that happens, no need to waste time on arguing
> > hypotheticals.
>
> We actually have lguest which is small. But it lacks functionality and the
> developer community KVM has attracted.

I suggested long ago to merge lguest into KVM to cover non-VMX/non-SVM
execution.

> > I think you are rationalizing the status quo.
>
> I see that there are issues with KVM today in some areas. You pointed out
> the desktop usability already. I personally have trouble with the
> qemu-kvm.git because it is unbisectable. But repository unification doesn't
> solve the problem here.

Why doesnt it solve the bisectability problem? The kernel repo is supposed to
be bisectable so that problem would be solved.

> The point for a single repository is that it simplifies the development
> process. I agree with you here. But the current process of KVM is not too
> difficult after all. I don't have to touch qemu sources for most of my work
> on KVM.

In my judgement you'd have to do that more frequently, if KVM was properly
weighting its priorities. For example regarding this recent KVM commit of
yours:

| commit ec1ff79084fccdae0dca9b04b89dcdf3235bbfa1
| Author: Joerg Roedel <joerg.roedel(a)amd.com>
| Date: Fri Oct 9 16:08:31 2009 +0200
|
| KVM: SVM: Add tracepoint for invlpga instruction
|
| This patch adds a tracepoint for the event that the guest
| executed the INVLPGA instruction.

With integrated KVM tooling i might have insisted on that new tracepoint
being available to users as well, via some more meaningful tooling than just
a pure tracepoint.

There are synergies like that all over the place.

You should realize that naturally developers will gravitate towards the most
'fun' aspects of a project. It is the task of the maintainer to keep the
balance between fun and utility, bugs and features, quality and code-rot.

> > It's as if you argued in 1990 that the unification of East and West
> > Germany wouldnt make much sense because despite clear problems and
> > incompatibilities and different styles westerners were still allowed to
> > visit eastern relatives and they both spoke the same language after all
> > ;-)
>
> Um, hmm. I don't think these situations have enough in common to compare
> them ;-)

Probably, but it's an interesting parallel nevertheless ;-)

Thanks,

Ingo

From: Ingo Molnar on 22 Mar 2010 13:00

* Avi Kivity <avi(a)redhat.com> wrote:

> > The crux of the problem is very simple. To quote my earlier mail:
> >
> > |
> > | - The inconvenience of having to type:
> > | perf kvm --host --guest --guestkallsyms=/home/ymzhang/guest/kallsyms \
> > | --guestmodules=/home/ymzhang/guest/modules top
> > |
> > |
> > | is very obvious even with a single guest. Now multiply that by more guests ...
> > |
> >
> > For example we want 'perf kvm top' to do something useful by default: it
> > should find the first guest running and it should report its profile.
> >
> > The tool shouldnt have to guess about where the guests are, what their
> > namespaces are and how to talk to them. We also want easy symbolic access to
> > guests, for example:
> >
> > perf kvm -g OpenSuse-2 record sleep 1

[ Sidenote: i still received no adequate suggestions about how to provide this
category of technical features. ]

> > I.e.:
> >
> > - Easy default reference to guest instances, and a way for tools to
> > reference them symbolically as well in the multi-guest case. Preferably
> > something trustable and kernel-provided - not some indirect information
> > like a PID file created by libvirt-manager or so.
>
> Usually 'layering violation' is trotted out at such suggestions.
> [...]

That's weird, how can a feature request be a 'layering violation'?

If something that users find straightforward and usable is a layering
violation to you (such as easily being able to access their own files on the
host as well ...) then i think you need to revisit the definition of that term
instead of trying to fix the user.

> [...] I don't like using the term, because sometimes the layers are
> incorrect and need to be violated. But it should be done explicitly, not as
> a shortcut for a minor feature (and profiling is a minor feature, most users
> will never use it, especially guest-from-host).
>
> The fact is we have well defined layers today, kvm virtualizes the cpu and
> memory, qemu emulates devices for a single guest, libvirt manages guests.
> We break this sometimes but there has to be a good reason. So perf needs to
> talk to libvirt if it wants names. Could be done via linking, or can be
> done using a plugin libvirt drops into perf.
>
> > - Guest-transparent VFS integration into the host, to recover symbols and
> > debug info in binaries, etc.
> >
> > There were a few responses to that but none really addressed those
> > problems - they mostly tried to re-define the problem and suggested that i
> > was wrong to want such capabilities and suggested various inferior
> > approaches instead. See the thread for the details - i think i covered
> > every technical suggestion that was made.
>
> You simply kept ignoring me when I said that if something can be kept out of
> the kernel without impacting performance, it should be. I don't want
> emergency patches closing some security hole or oops in a kernel symbol
> server.

I never suggested an "in kernel space symbol server" which could oops, why
would i have suggested that? Please point me to an email where i suggested
that.

> The usability argument is a red herring. True, it takes time for things to
> trickle down to distributions and users. Those who can't wait can download
> the code and compile, it isn't that difficult.

It's not just "download and compile", it's also "configure correctly for
several separate major distributions" and "configure to per guest instance
local rules".

It's far more fragile in practice than you make it appear to be, and since you
yourself expressed that you are not interested much in the tooling side, how
can you have adequate experience to judge such matters?

In fact for instrumentation it's beyond a critical threshold of fragility -
instrumentation above all needs to be accessible, transparent and robust.

If you cannot see the advantages of a properly integrated solution then i
suspect there's not much i can do to convince you.

And you ignored not just me but you ignored several people in this thread who
thought the current status quo was inadequate and expressed interest in both
the VFS integration and in the guest enumeration features.

Thanks,

Ingo
