Enhance perf to collect KVM guest os statistics from host side [Kernel]


From: Avi Kivity on 16 Mar 2010 09:40

On 03/16/2010 03:31 PM, Ingo Molnar wrote:
>
>> You can do that through libvirt, but that only works for guests started
>> through libvirt. libvirt provides command-line tools to list and manage
>> guests (for example autostarting them on startup), and tools built on top of
>> libvirt can manage guests graphically.
>>
>> Looks like we have a layer inversion here. Maybe we need a plugin system -
>> libvirt drops a .so into perf that teaches it how to list guests and get
>> their symbols.
>>
> Is libvirt used to start up all KVM guests? If not, if it's only used on some
> distros while other distros have other solutions then there's apparently no
> good way to get to such information, and the kernel bits of KVM do not provide
> it.
>

Developers tend to start qemu from the command line, but the majority of
users and all distros I know of use libvirt. Some users cobble together
their own scripts.

> To the user (and to me) this looks like a KVM bug / missing feature. (and the
> user doesn't care where the blame is) If that is true then apparently the
> current KVM design has no technically actionable solution for certain
> categories of features!
>

A plugin system allows anyone who is interested to provide the
information; they just need to write a plugin for their management tool.

Since we can't prevent people from writing management tools, I don't see
what else we can do.
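
[Illustration, not part of the original message] The plugin idea above could look roughly like the sketch below: each management tool registers a callable that lists its guests, and the profiler merges the results. This is only a sketch under assumptions. It is not perf's or libvirt's actual interface, and every name in it (`GuestInfo`, `register_lister`, `list_guests`, the sample plugins, the PIDs) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List


@dataclass(frozen=True)
class GuestInfo:
    name: str    # symbolic guest name as known to the management tool
    pid: int     # PID of the qemu process backing the guest
    source: str  # which plugin reported the guest


# Registry mapping a plugin name to a callable that returns that tool's guests.
_listers: Dict[str, Callable[[], Iterable[GuestInfo]]] = {}


def register_lister(plugin: str, lister: Callable[[], Iterable[GuestInfo]]) -> None:
    """Called by a management tool's plugin to teach the profiler how to list guests."""
    _listers[plugin] = lister


def list_guests() -> List[GuestInfo]:
    """Merge the guests reported by every registered plugin, sorted by name then PID."""
    guests = [g for lister in _listers.values() for g in lister()]
    return sorted(guests, key=lambda g: (g.name, g.pid))


# Two made-up plugins standing in for a management tool and a user's own scripts.
register_lister("libvirt", lambda: [GuestInfo("Fedora", 4211, "libvirt")])
register_lister("scripts", lambda: [GuestInfo("OpenSuse", 5120, "scripts")])

for guest in list_guests():
    print(f"{guest.name} (pid {guest.pid}, via {guest.source})")
```

The profiler only ever sees the registry, so a guest started by a tool without a plugin simply does not appear in the list.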

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Ingo Molnar on 16 Mar 2010 09:40

* Avi Kivity <avi(a)redhat.com> wrote:

> On 03/16/2010 03:08 PM, Ingo Molnar wrote:
> >
> >>>I mean, i can trust a kernel service and i can trust /proc/kallsyms.
> >>>
> >>>Can perf trust a random process claiming to be Qemu? What's the trust
> >>>mechanism here?
> >>Obviously you can't trust anything you get from a guest, no matter how you
> >>get it.
> >I'm not talking about the symbol strings and addresses, and the object
> >contents for allocation (or debuginfo). I'm talking about the basic protocol
> >of establishing which guest is which.
>
> There is none. So far, qemu only dealt with managing just its own
> guest, and left all multiple guest management to higher levels up
> the stack (like libvirt).
>
> >I.e. we really want users to be able to:
> >
> > 1) have it all working with a single guest, without having to specify 'which'
> > guest (qemu PID) to work with. That is the dominant usecase both for
> > developers and for a fair portion of testers.
>
> That's reasonable if we can get it working simply.

IMO such ease of use is reasonable and required, full stop.

If it cannot be gotten simply then that's a bug: either in the code, or in the
design, or in the development process that led to the design. Bugs need
fixing.

> > 2) Have some reasonable symbolic identification for guests. For example a
> > usable approach would be to have 'perf kvm list', which would list all
> > currently active guests:
> >
> > $ perf kvm list
> > [1] Fedora
> > [2] OpenSuse
> > [3] Windows-XP
> > [4] Windows-7
> >
> > And from that point on 'perf kvm -g OpenSuse record' would do the obvious
> > thing. Users will be able to just use the 'OpenSuse' symbolic name for
> > that guest, even if the guest got restarted and switched its main PID.
> >
> > Any such facility needs trusted enumeration and a protocol where i can
> > trust that the information i got is authoritative. (I.e. 'OpenSuse' truly
> > matches to the OpenSuse session - not to some local user starting up a
> > Qemu instance that claims to be 'OpenSuse'.)
> >
> > Is such a scheme possible/available? I suspect all the KVM configuration
> > tools (i haven't used them in some time - gui and command-line tools alike)
> > use similar methods to ease guest management?
>
> You can do that through libvirt, but that only works for guests started
> through libvirt. libvirt provides command-line tools to list and manage
> guests (for example autostarting them on startup), and tools built on top of
> libvirt can manage guests graphically.
>
> Looks like we have a layer inversion here. Maybe we need a plugin system -
> libvirt drops a .so into perf that teaches it how to list guests and get
> their symbols.

Is libvirt used to start up all KVM guests? If not, if it's only used on some
distros while other distros have other solutions then there's apparently no
good way to get to such information, and the kernel bits of KVM do not provide
it.

To the user (and to me) this looks like a KVM bug / missing feature. (and the
user doesn't care where the blame is) If that is true then apparently the
current KVM design has no technically actionable solution for certain
categories of features!

Ingo

From: Ingo Molnar on 16 Mar 2010 12:00

* Frank Ch. Eigler <fche(a)redhat.com> wrote:

>
> Ingo Molnar <mingo(a)elte.hu> writes:
>
> > [...]
> >> >I.e. we really want users to be able to:
> >> >
> >> > 1) have it all working with a single guest, without having to specify 'which'
> >> > guest (qemu PID) to work with. That is the dominant usecase both for
> >> > developers and for a fair portion of testers.
> >>
> >> That's reasonable if we can get it working simply.
> >
> > IMO such ease of use is reasonable and required, full stop.
> > If it cannot be gotten simply then that's a bug: either in the code, or in the
> > design, or in the development process that led to the design. Bugs need
> > fixing. [...]
>
> Perhaps the fact that kvm happens to deal with an interesting application
> area (virtualization) is misleading here. As far as the host kernel or
> other host userspace is concerned, qemu is just some random unprivileged
> userspace program (with some *optional* /dev/kvm services that might happen
> to require temporary root).
>
> As such, perf trying to instrument qemu is no different than perf trying to
> instrument any other userspace widget. Therefore, expecting 'trusted
> enumeration' of instances is just as sensible as using 'trusted ps' and
> 'trusted /var/run/FOO.pid files'.

You are quite mistaken: KVM isn't really a 'random unprivileged application' in
this context, it is clearly an extension of system/kernel services.

( Which can be seen from the simple fact that what started the discussion was
'how do we get /proc/kallsyms from the guest'. I.e. an extension of the
existing host-space /proc/kallsyms was desired. )

In that sense the most natural 'extension' would be the solution i mentioned a
week or two ago: to have a (read only) mount of all guest filesystems, plus a
channel for profiling/tracing data. That would make symbol parsing easier and
it's what extends the existing 'host space' abstraction in the most natural
way.

( It doesn't even have to be done via the kernel - Qemu could implement that
via FUSE for example. )
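
[Illustration, not part of the original message] If a guest's /proc/kallsyms were readable from the host, for instance through such a read-only mount, symbol parsing would reduce to ordinary text processing. The sketch below assumes the usual kallsyms line layout (hex address, one-letter type, symbol name, optional `[module]`). The sample addresses and the helper names `parse_kallsyms` and `resolve` are made up for illustration, and how the text actually reaches the host is left open.

```python
import bisect
from typing import List, NamedTuple, Optional


class Symbol(NamedTuple):
    address: int
    kind: str
    name: str
    module: Optional[str]


def parse_kallsyms(text: str) -> List[Symbol]:
    """Parse kallsyms-style text into symbols sorted by address."""
    symbols = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        module = fields[3].strip("[]") if len(fields) > 3 else None
        symbols.append(Symbol(int(fields[0], 16), fields[1], fields[2], module))
    return sorted(symbols, key=lambda s: s.address)


def resolve(symbols: List[Symbol], address: int) -> Optional[Symbol]:
    """Return the symbol with the highest start address not above `address`."""
    starts = [s.address for s in symbols]
    index = bisect.bisect_right(starts, address) - 1
    return symbols[index] if index >= 0 else None


# Made-up sample standing in for a guest's kallsyms contents.
SAMPLE = (
    "ffffffff81000000 T _text\n"
    "ffffffff81001000 T do_one_initcall\n"
    "ffffffffa0000000 t kvm_vcpu_ioctl\t[kvm]\n"
)

guest_symbols = parse_kallsyms(SAMPLE)
hit = resolve(guest_symbols, 0xFFFFFFFF81001010)
print(hit.name if hit else "unknown")
```

Nothing here addresses the trust question raised earlier in the thread: the parser accepts whatever text it is handed.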

As a second best option a 'symbol server' might be used too.

Thanks,

Ingo

From: Ingo Molnar on 16 Mar 2010 12:40

* Frank Ch. Eigler <fche(a)redhat.com> wrote:

> Hi -
>
> On Tue, Mar 16, 2010 at 04:52:21PM +0100, Ingo Molnar wrote:
> > [...]
> > > Perhaps the fact that kvm happens to deal with an interesting application
> > > area (virtualization) is misleading here. As far as the host kernel or
> > > other host userspace is concerned, qemu is just some random unprivileged
> > > userspace program [...]
>
> > You are quite mistaken: KVM isn't really a 'random unprivileged
> > application' in this context, it is clearly an extension of
> > system/kernel services.
>
> I don't know what "extension of system/kernel services" means in this
> context, beyond something running on the system/kernel, like every other
> process. [...]

It means something like my example of 'extended to guest space'
/proc/kallsyms:

> > [...]
> >
> > ( Which can be seen from the simple fact that what started the
> > discussion was 'how do we get /proc/kallsyms from the guest'. I.e. an
> > extension of the existing host-space /proc/kallsyms was desired. )
>
> (Sorry, that smacks of circular reasoning.)

To me it sounds like an example supporting my point. /proc/kallsyms is a
service by the kernel, and 'perf kvm' desires this to be extended to guest
space as well.

Thanks,

Ingo

From: Anthony Liguori on 16 Mar 2010 13:20

On 03/16/2010 08:08 AM, Ingo Molnar wrote:
> * Avi Kivity<avi(a)redhat.com> wrote:
>
>
>> On 03/16/2010 02:29 PM, Ingo Molnar wrote:
>>
>
>>> I mean, i can trust a kernel service and i can trust /proc/kallsyms.
>>>
>>> Can perf trust a random process claiming to be Qemu? What's the trust
>>> mechanism here?
>>>
>> Obviously you can't trust anything you get from a guest, no matter how you
>> get it.
>>
> I'm not talking about the symbol strings and addresses, and the object
> contents for allocation (or debuginfo). I'm talking about the basic protocol
> of establishing which guest is which.
>
> I.e. we really want users to be able to:
>
> 1) have it all working with a single guest, without having to specify 'which'
> guest (qemu PID) to work with. That is the dominant usecase both for
> developers and for a fair portion of testers.
>

You're making too many assumptions.

There is no list of guests any more than there is a list of web browsers.

You can have a multi-tenant scenario where you have distinct groups of
virtual machines running as unprivileged users.

> 2) Have some reasonable symbolic identification for guests. For example a
> usable approach would be to have 'perf kvm list', which would list all
> currently active guests:
>
> $ perf kvm list
> [1] Fedora
> [2] OpenSuse
> [3] Windows-XP
> [4] Windows-7
>
> And from that point on 'perf kvm -g OpenSuse record' would do the obvious
> thing. Users will be able to just use the 'OpenSuse' symbolic name for
> that guest, even if the guest got restarted and switched its main PID.
>

Does "perf kvm list" always run as root? What if two unprivileged users
both have a VM named "Fedora"?
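
[Illustration, not part of the original message] The naming problem can be shown with a small sketch: an index keyed by guest name alone becomes ambiguous as soon as two users pick the same name, while a key of (owner uid, name) stays unique. The data and the helper names `ambiguous_names` and `by_owner_and_name` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (owner uid, guest name, qemu pid); made-up data for illustration only.
RUNNING: List[Tuple[int, str, int]] = [
    (1000, "Fedora", 4211),
    (1001, "Fedora", 5120),
    (1001, "OpenSuse", 5133),
]


def ambiguous_names(guests: List[Tuple[int, str, int]]) -> List[str]:
    """Return the guest names that map to more than one running instance."""
    pids_by_name: Dict[str, List[int]] = defaultdict(list)
    for _uid, name, pid in guests:
        pids_by_name[name].append(pid)
    return sorted(name for name, pids in pids_by_name.items() if len(pids) > 1)


def by_owner_and_name(guests: List[Tuple[int, str, int]]) -> Dict[Tuple[int, str], int]:
    """Index guests by (owner uid, name), which keeps same-named guests apart."""
    return {(uid, name): pid for uid, name, pid in guests}


print(ambiguous_names(RUNNING))
print(by_owner_and_name(RUNNING)[(1001, "Fedora")])
```

A root-run listing would therefore have to show the owner next to each name, and an unprivileged one would have to be restricted to the caller's own guests.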

If we look at the use-case, it's going to be something like, a user is
creating virtual machines and wants to get performance information about
them.

Having to run a separate tool like perf is not what they would expect to
have to do. Instead, they would either use their existing GUI tool (like
virt-manager) or they would use their management interface (either QMP or
libvirt).

The complexity of interaction is due to the fact that perf shouldn't be
a standalone tool. It should be a library or something with a
programmatic interface that another tool can make use of.

Regards,

Anthony Liguori

> Is such a scheme possible/available? I suspect all the KVM configuration tools
> (i haven't used them in some time - gui and command-line tools alike) use
> similar methods to ease guest management?
>
> Ingo
> --
> To unsubscribe from this list: send the line "unsubscribe kvm" in
> the body of a message to majordomo(a)vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>

