From: Daniel Kiper on
Hello,

My name is Daniel Kiper and I am a PhD student
at Warsaw University of Technology, Faculty of Electronics
and Information Technology (I am working on business continuity
and disaster recovery services with emphasis on Air Traffic Management).

This year I submitted a proposal to Google Summer of Code 2010 on
migrating from memory ballooning to memory hotplug in Xen (it was one of
my two proposals). It was accepted, and I am now a happy GSoC 2010
student. My mentor is Jeremy Fitzhardinge. I would like to thank him
for his patience and helping hand.

OK, let's get to the details. When I was playing with Xen, I saw that
ballooning cannot extend memory beyond the boundary declared at system
start. Yes, I know that this is by design; however, I think it is a
limitation that could be very annoying in some environments (I am
thinking especially of servers). That is why I decided to develop some
code to remove it. At the beginning I thought that ballooning should be
replaced by memory hotplug; however, after some tests and discussion
with Jeremy, we decided to combine ballooning (for memory removal) with
memory hotplug (for extending memory above the boundary declared at
system startup). Additionally, we decided to implement this solution
for Linux Xen guests in all forms (PV/i386,x86_64 and
HVM/i386,x86_64).

Now, I have done most of the planned tests and written a PoC.

Short description of the current algorithm (it was prepared
for the PoC and will be changed to provide a convenient
mechanism for the user):
- find free (not claimed by another memory region or device)
memory region of PAGES_PER_SECTION << PAGE_SHIFT
size in iomem_resource,
- find all PFNs for chosen memory region
(addr >> PAGE_SHIFT),
- allocate memory from hypervisor by
HYPERVISOR_memory_op(XENMEM_populate_physmap, &memory_region),
- inform system about new memory region and reserve it by
mm/memory_hotplug.c:add_memory(memory_add_physaddr_to_nid(start_addr),
start_addr, PAGES_PER_SECTION << PAGE_SHIFT),
- online memory region by
mm/memory_hotplug.c:online_pages(start_addr >> PAGE_SHIFT,
PAGES_PER_SECTION << PAGE_SHIFT).
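
The first two steps above can be illustrated with a small userspace
sketch. It is not the PoC code: struct claimed_range and
find_free_section() are hypothetical stand-ins for walking
iomem_resource, and the values of PAGE_SHIFT and SECTION_SIZE_BITS are
assumptions (4 KiB pages, 128 MiB sections). The search is a plain
first-fit over a sorted list of claimed ranges.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT        12   /* assumed: 4 KiB pages */
#define SECTION_SIZE_BITS 27   /* assumed: 128 MiB sections */
#define SECTION_BYTES     (UINT64_C(1) << SECTION_SIZE_BITS)
#define SECTION_PAGES     (SECTION_BYTES >> PAGE_SHIFT)

/* Hypothetical stand-in for one claimed entry in iomem_resource. */
struct claimed_range {
    uint64_t start;  /* first byte of the range */
    uint64_t end;    /* last byte of the range, inclusive */
};

/*
 * First-fit search: return the lowest section-aligned address at which
 * one whole section fits without overlapping any claimed range, or
 * UINT64_MAX if nothing fits below `limit`. The `claimed` array must be
 * sorted by start address and must not contain overlapping entries.
 */
uint64_t find_free_section(const struct claimed_range *claimed,
                           size_t n, uint64_t limit)
{
    uint64_t candidate = 0;

    for (size_t i = 0; i < n; i++) {
        /* The gap in front of this entry is large enough: done. */
        if (candidate + SECTION_BYTES <= claimed[i].start)
            break;
        /* Otherwise skip past the entry, rounding up to a section. */
        if (claimed[i].end + 1 > candidate)
            candidate = (claimed[i].end + SECTION_BYTES)
                        & ~(SECTION_BYTES - 1);
    }
    if (candidate + SECTION_BYTES > limit)
        return UINT64_MAX;
    return candidate;
}

/* PFN of the page containing `addr` (addr >> PAGE_SHIFT). */
uint64_t addr_to_pfn(uint64_t addr)
{
    return addr >> PAGE_SHIFT;
}
```

The PFNs of the chosen region are then addr_to_pfn(start) up to
addr_to_pfn(start) + SECTION_PAGES - 1. A real implementation would
walk the kernel resource tree instead of an array.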

Currently, memory is added and onlined in 128 MiB blocks (the section
size for x86); however, I am going to do that in smaller chunks.
Additionally, some steps are still done manually, but that
will change in the final implementation.
I would like to mention that this solution
does not require any changes to the Xen hypervisor.
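
As a quick check of the 128 MiB figure, the block size follows from
PAGES_PER_SECTION << PAGE_SHIFT. The sketch below assumes a section
size of 2^27 bytes and 4 KiB pages; both values are passed in as
parameters because they depend on architecture and kernel
configuration, and the function names are illustrative only.

```c
#include <stdint.h>

/* Pages in one section: 2^(section_size_bits - page_shift). */
uint64_t pages_per_section(unsigned section_size_bits, unsigned page_shift)
{
    return UINT64_C(1) << (section_size_bits - page_shift);
}

/* Bytes in one section, i.e. PAGES_PER_SECTION << PAGE_SHIFT. */
uint64_t section_bytes(unsigned section_size_bits, unsigned page_shift)
{
    return pages_per_section(section_size_bits, page_shift) << page_shift;
}
```

With the assumed values, pages_per_section(27, 12) is 32768 pages and
section_bytes(27, 12) is 134217728 bytes, which is 128 MiB.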

I am going to send you the first version of the patch
(fully working) next week.

If you have any questions, please drop me a line.

Daniel
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Andi Kleen on
Daniel Kiper <dkiper(a)net-space.pl> writes:
>
> OK, let's get to the details. When I was playing with Xen, I saw that
> ballooning cannot extend memory beyond the boundary declared at system
> start. Yes, I know that this is by design; however, I think it is a
> limitation that could be very annoying in some environments (I am
> thinking especially of servers). That is why I decided to develop some
> code to remove it. At the beginning I thought that ballooning should be
> replaced by memory hotplug; however, after some tests and discussion
> with Jeremy, we decided to combine ballooning (for memory removal) with
> memory hotplug (for extending memory above the boundary declared at
> system startup). Additionally, we decided to implement this solution
> for Linux Xen guests in all forms (PV/i386,x86_64 and
> HVM/i386,x86_64).

While you can do that, the value is not very large, because you
could just start the guests with more memory, but ballooned in
the first place (so that they don't actually use it).

The only advantage of using memory hotadd is that the mem_map doesn't
need to be pre-allocated, but that's only a few percent of the memory.

So it would only help if you want to add gigantic amounts of memory
to a VM (like >20-30x of what it already has).
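
A rough worked example of that overhead, under stated assumptions: the
sketch takes 4 KiB pages and a 64-byte struct page. The struct page
size is an assumed round figure (the real size depends on kernel
version and configuration), and mem_map_bytes() is an illustrative
helper, not a kernel function.

```c
#include <stdint.h>

/*
 * Bytes of mem_map needed to describe `mem_bytes` of memory, given the
 * page shift and an assumed sizeof(struct page).
 */
uint64_t mem_map_bytes(uint64_t mem_bytes, unsigned page_shift,
                       unsigned struct_page_size)
{
    return (mem_bytes >> page_shift) * struct_page_size;
}
```

Under these assumptions 1 GiB of memory needs 16 MiB of mem_map, about
1.6 percent. A guest booted with a 30 GiB maximum but ballooned down to
1 GiB would carry 480 MiB of pre-allocated mem_map, which is close to
half of the memory it actually uses. That is the regime where hotadd
starts to pay off.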

One trap is also that memory hotadd is a frequent source of regressions,
so you'll likely run into existing bugs.

-Andi

--
ak(a)linux.intel.com -- Speaking for myself only.
From: Daniel Kiper on
On Thu, Jul 08, 2010 at 04:12:01PM -0700, Dan Magenheimer wrote:
> > From: Andi Kleen [mailto:andi(a)firstfloor.org]
> >
> > Daniel Kiper <dkiper(a)net-space.pl> writes:
> > >
> > > OK, let's get to the details. When I was playing with Xen, I saw
> > > that ballooning cannot extend memory beyond the boundary declared
> > > at system start. Yes, I know that this is by design; however, I
> > > think it is a limitation that could be very annoying in some
> > > environments (I am thinking especially of servers). That is why I
> > > decided to develop some code to remove it. At the beginning I
> > > thought that ballooning should be replaced by memory hotplug;
> > > however, after some tests and discussion with Jeremy, we decided
> > > to combine ballooning (for memory removal) with memory hotplug
> > > (for extending memory above the boundary declared at system
> > > startup). Additionally, we decided to implement this solution for
> > > Linux Xen guests in all forms (PV/i386,x86_64 and
> > > HVM/i386,x86_64).
> >
> > While you can do that the value is not very large because you
> > could just start the guests with more memory, but ballooned in
> > the first place (so that they don't actually use it)
> >
> > The only advantage of using memory hotadd is that the mem_map doesn't
> > need to be pre-allocated, but that's only a few percent of the memory.
> >
> > So it would only help if you want to add gigantic amounts of memory
> > to a VM (like >20-30x of what it already has).
>
> One can envision a scenario where a cloud customer launches a
> business-critical VM with some reasonably large "maxmem" set,
> balloons up to the max, then finds out it isn't enough after
> all and would like to avoid rebooting. Or a cloud provider
> might charge for a specific maxmem, but allow the customer
> to increase maxmem if they pay more money.

Dan's scenario description is very good (thx). The idea behind this
project was to serve exactly those cases. Maybe some of the
misunderstanding comes from the short description of my proposal.

Daniel
From: Daniel Kiper on
On Thu, Jul 08, 2010 at 04:16:00PM -0700, Jeremy Fitzhardinge wrote:
> On 07/08/2010 12:45 PM, Daniel Kiper wrote:
> > - find free (not claimed by another memory region or device)
> > memory region of PAGES_PER_SECTION << PAGE_SHIFT
> > size in iomem_resource,
>
> Presumably in the common case this will be at the end of the memory
> map? Since a typical PV domain has all its initial memory allocated low
> and doesn't have any holes.

Yes, I know about that; however, I think it is much better
to write a more generic algorithm which also looks for
holes (unclaimed regions) in memory (maybe something will
change in the future). Additionally, this list is usually
very short, so the cost of a scan is low.

> > - find all PFNs for chosen memory region
> > (addr >> PAGE_SHIFT),
> > - allocate memory from hypervisor by
> > HYPERVISOR_memory_op(XENMEM_populate_physmap, &memory_region),
>
> Is it actually necessary to allocate the memory at this point?

Yes, it is, because mm/memory_hotplug.c:add_memory
(well, not exactly this function) updates the memory map.

> > - inform system about new memory region and reserve it by
> > mm/memory_hotplug.c:add_memory(memory_add_physaddr_to_nid(start_addr),
> > start_addr, PAGES_PER_SECTION << PAGE_SHIFT),
> > - online memory region by
> > mm/memory_hotplug.c:online_pages(start_addr >> PAGE_SHIFT,
> > PAGES_PER_SECTION << PAGE_SHIFT).
>
> It seems to me you could add the memory (to get the new struct pages)
> and "online" it, but immediately take a reference to the page and give
> it over to the balloon driver to manage as a ballooned-out page. Then,
> when you actually need the memory, the balloon driver can provide it in
> the normal way.

I am going to do it in a similar way.
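
The scheme Jeremy describes can be pictured with a toy accounting
model. This is only a sketch under assumptions, not kernel or balloon
driver code: struct guest_mem and all function names are hypothetical,
and the hypervisor is assumed to always grant the requested pages.
Hot-adding a section only raises the number of pages the kernel knows
about; the new pages start out ballooned out and become usable when
the balloon target is raised.

```c
#include <stdint.h>

/* Hypothetical page accounting for one guest. */
struct guest_mem {
    uint64_t known_pages;     /* pages the kernel has struct pages for */
    uint64_t populated_pages; /* pages currently backed by real memory */
};

/* Pages that are known to the kernel but held by the balloon. */
uint64_t ballooned_out(const struct guest_mem *g)
{
    return g->known_pages - g->populated_pages;
}

/*
 * Hot-add one section: the kernel gains struct pages for it, but every
 * new page is handed straight to the balloon, so nothing is populated.
 */
void hotadd_section(struct guest_mem *g, uint64_t section_pages)
{
    g->known_pages += section_pages;
}

/*
 * Move the balloon toward `target_pages`. The target is clamped to the
 * number of known pages, which is the ceiling that ballooning alone
 * cannot cross. Returns the resulting populated page count.
 */
uint64_t balloon_set_target(struct guest_mem *g, uint64_t target_pages)
{
    if (target_pages > g->known_pages)
        target_pages = g->known_pages;
    g->populated_pages = target_pages;
    return g->populated_pages;
}
```

In this model a guest with 65536 known pages cannot balloon up to
98304 pages until hotadd_section() has added another 32768; after
that, the normal balloon path provides the memory.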

> > I am going to send you the first version of the patch
> > (fully working) next week.
>
> Looking forward to it. What kernel is it based on?

Version 2.6.32.10; however, I expect it will be no problem
to move it to the current version.

Daniel