From: Jeremy Fitzhardinge on
On 04/25/2010 05:11 AM, Avi Kivity wrote:
> No need to change the kernel at all; the hypervisor controls the page
> tables.

Not in Xen PV guests (the hypervisor vets guest updates, but it can't
safely make its own changes to the pagetables). (It's kind of annoying.)

J
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Dan Magenheimer on
> > Well if you are saying that your primary objection to the
> > frontswap synchronous API is that it is exposed to modules via
> > some EXPORT_SYMBOLs, we can certainly fix that, at least
> > unless/until there are other pseudo-RAM devices that can use it.
> >
> > Would that resolve your concerns?
> >
>
> By external interfaces I mean the guest/hypervisor interface.
> EXPORT_SYMBOL is an internal interface as far as I'm concerned.
>
> Now, the frontswap interface is also an internal interface, but it's
> close to the external one. I'd feel much better if it was
> asynchronous.

OK, so on the one hand, you think that the proposed synchronous
interface for frontswap is insufficiently extensible for other
uses (presumably including KVM). On the other hand, you agree
that using the existing I/O subsystem is unnecessarily heavyweight.
On the third hand, Nitin has answered your questions and spent
a good part of three years finding that extending the existing swap
interface to efficiently support swap-to-pseudo-RAM requires
some kind of in-kernel notification mechanism to which Linus
has already objected.

So you are instead proposing some new guest-to-host asynchronous
notification mechanism that doesn't use the existing bio
mechanism (and so presumably not irqs), imitates or can
utilize a dma engine, and uses less cpu cycles than copying
pages. AND, for long-term maintainability, you'd like to avoid
creating a new guest-host API that does all this, even one that
is as simple and lightweight as the proposed frontswap hooks.

Does that summarize your objection well?
From: Avi Kivity on
On 04/27/2010 11:29 AM, Dan Magenheimer wrote:
>
> OK, so on the one hand, you think that the proposed synchronous
> interface for frontswap is insufficiently extensible for other
> uses (presumably including KVM). On the other hand, you agree
> that using the existing I/O subsystem is unnecessarily heavyweight.
> On the third hand, Nitin has answered your questions and spent
> a good part of three years finding that extending the existing swap
> interface to efficiently support swap-to-pseudo-RAM requires
> some kind of in-kernel notification mechanism to which Linus
> has already objected.
>
> So you are instead proposing some new guest-to-host asynchronous
> notification mechanism that doesn't use the existing bio
> mechanism (and so presumably not irqs),

(any notification mechanism has to use irqs if it exits the guest)

> imitates or can
> utilize a dma engine, and uses less cpu cycles than copying
> pages. AND, for long-term maintainability, you'd like to avoid
> creating a new guest-host API that does all this, even one that
> is as simple and lightweight as the proposed frontswap hooks.
>
> Does that summarize your objection well?
>

No. Adding a new async API that parallels the block layer would be
madness. My first preference would be to completely avoid new APIs. I
think that would work for swap-to-hypervisor but probably not for
compcache. Second preference is the synchronous API, third is a new
async API.

--
error compiling committee.c: too many arguments to function

From: Valdis.Kletnieks on
On Sun, 25 Apr 2010 06:37:30 PDT, Dan Magenheimer said:

> While I admit that I started this whole discussion by implying
> that frontswap (and cleancache) might be useful for SSDs, I think
> we are going far astray here. Frontswap is synchronous for a
> reason: It uses real RAM, but RAM that is not directly addressable
> by a (guest) kernel.

Are there any production boxes that actually do this currently? I know IBM had
'expanded storage' on the 3090 series 20 years ago; I haven't checked whether the
Z-series still does that. It was very cool at the time: it supported 900+ users with
128M of main memory and 256M of expanded storage, because you got the first
3,000 or so page faults per second for almost free. Oh, and the 3090 had 2
special opcodes for "move page to/from expanded", so it was a very fast but
still synchronous move (for whatever that's worth).
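
To make the idea of a fast but synchronous page move concrete, here is a
minimal C sketch. It assumes 4KB pages, as discussed elsewhere in the thread.
All names (toy_pool, toy_put_page, toy_get_page) are hypothetical; this is not
the API of the frontswap patch and not a model of the 3090 opcodes. The pool is
an ordinary array standing in for RAM that the kernel cannot address directly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define TOY_PAGE_SIZE 4096u /* 4KB pages (assumption taken from the thread) */
#define TOY_SLOTS     8u    /* arbitrary capacity of the toy pool */

/* Hypothetical stand-in for memory the caller cannot address directly. */
struct toy_pool {
    unsigned char data[TOY_SLOTS][TOY_PAGE_SIZE];
    bool used[TOY_SLOTS];
};

/*
 * Synchronous put: the page has been copied by the time this returns.
 * Returns 0 on success, -1 if the pool refuses the page; the caller
 * has to handle the refusal itself.
 */
static int toy_put_page(struct toy_pool *pool, size_t slot, const void *page)
{
    assert(pool != NULL && page != NULL);
    if (slot >= TOY_SLOTS)
        return -1;
    memcpy(pool->data[slot], page, TOY_PAGE_SIZE);
    pool->used[slot] = true;
    return 0;
}

/*
 * Synchronous get: on success the page contents are in 'page' on return.
 * Returns -1 if nothing was stored in that slot.
 */
static int toy_get_page(const struct toy_pool *pool, size_t slot, void *page)
{
    assert(pool != NULL && page != NULL);
    if (slot >= TOY_SLOTS || !pool->used[slot])
        return -1;
    memcpy(page, pool->data[slot], TOY_PAGE_SIZE);
    return 0;
}
```

The point of the sketch is only the calling convention: there is no completion
callback and no request queue, so the caller knows the outcome as soon as the
call returns. An asynchronous variant would need both.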

From: Pavel Machek on
Hi!

> > > Nevertheless, frontswap works great today with a bare-metal
> > > hypervisor. I think it stands on its own merits, regardless
> > > of one's vision of future SSD/memory technologies.
> >
> > Even when frontswapping to RAM on a bare metal hypervisor it makes sense
> > to use an async API, in case you have a DMA engine on board.
>
> When pages are 2MB, this may be true. When pages are 4KB and
> copied individually, it may take longer to program a DMA engine
> than to just copy 4KB.
>
> But in any case, frontswap works fine on all existing machines
> today. If/when most commodity CPUs have an asynchronous RAM DMA
> engine, an asynchronous API may be appropriate. Or the existing
> swap API might be appropriate. Or the synchronous frontswap API
> may work fine too. Speculating further about non-existent
> hardware that might exist in the (possibly far) future is irrelevant
> to the proposed patch, which works today on all existing x86 hardware
> and on shipping software.

If we had added all the APIs that worked when proposed, we'd have had an
unmaintainable mess by about 1996.

Why can't frontswap just use the existing swap API?
Pavel

--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html