From: Balbir Singh
* Balbir Singh <balbir(a)linux.vnet.ibm.com> [2010-06-08 21:21:46]:

> Selectively control Unmapped Page Cache (nospam version)
>
> From: Balbir Singh <balbir(a)linux.vnet.ibm.com>
>
> This patch implements unmapped page cache control via preferred
> page cache reclaim. The current patch hooks into kswapd and reclaims
> page cache if the user has requested unmapped page cache control.
> This is useful in the following scenarios:
>
> - In a virtualized environment with cache=writethrough, we see
> double caching (one copy in the host and one in the guest). As
> we try to scale guests, cache usage across the system grows.
> The goal of this patch is to reclaim page cache when Linux is running
> as a guest, and to get the host to hold the page cache and manage it.
> There might be temporary duplication, but in the long run, memory
> in the guests would be used for mapped pages.
> - The option is controlled via a boot option, and the administrator
> can selectively turn it on, on an as-needed basis.
>
> A lot of the code is borrowed from the zone_reclaim_mode logic in
> __zone_reclaim(). One might argue that with ballooning and
> KSM this feature is not very useful, but even with ballooning,
> we need extra logic to balloon multiple VMs, and it is hard
> to figure out the correct amount of memory to balloon. With these
> patches applied, each guest has a sufficient amount of free memory
> available that can be easily seen and reclaimed by the balloon driver.
> The additional memory in the guest can be reused for additional
> applications, to start additional guests, or to balance memory in
> the host.
>
> KSM currently does not de-duplicate host and guest page cache. The goal
> of this patch is to help automatically balance unmapped page cache when
> instructed to do so.
>
> There are some magic numbers in use in the code: UNMAPPED_PAGE_RATIO
> and the number of pages to reclaim when the unmapped_page_control
> argument is supplied. These numbers were chosen to avoid reaping page
> cache too aggressively or too frequently, while still providing control.
>
> The sysctl min_unmapped_ratio provides further control from
> within the guest over the number of unmapped pages to reclaim.
>
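Here is a minimal sketch of the shape such a hook could take, modeled on
the zone_reclaim_mode logic in mm/vmscan.c that the description above says
the code borrows from. The names unmapped_page_control and
UNMAPPED_PAGE_RATIO come from the description; the helper names, the ratio
value, and the placement are illustrative guesses, not the actual patch:

#define UNMAPPED_PAGE_RATIO 16	/* illustrative; the patch picks its own value */

static int unmapped_page_control __read_mostly;

/* "unmapped_page_control" on the kernel command line enables the feature */
static int __init setup_unmapped_page_control(char *str)
{
	unmapped_page_control = 1;
	return 1;
}
__setup("unmapped_page_control", setup_unmapped_page_control);

/* Unmapped page cache in a zone: file-backed pages no process has mapped */
static unsigned long zone_unmapped_file_pages(struct zone *zone)
{
	unsigned long file_pages = zone_page_state(zone, NR_FILE_PAGES);
	unsigned long file_mapped = zone_page_state(zone, NR_FILE_MAPPED);

	return file_pages > file_mapped ? file_pages - file_mapped : 0;
}

/*
 * Hook for kswapd's balancing loop: reclaim page cache only when the
 * unmapped page cache exceeds the min_unmapped_ratio floor by a wide
 * margin, so that kswapd does not reap the cache on every wakeup.
 */
static int should_reclaim_unmapped_pages(struct zone *zone)
{
	if (!unmapped_page_control)
		return 0;

	return zone_unmapped_file_pages(zone) >
			UNMAPPED_PAGE_RATIO * zone->min_unmapped_pages;
}

Here zone->min_unmapped_pages is the per-zone floor the kernel derives from
the existing vm.min_unmapped_ratio sysctl, i.e. the knob the last paragraph
above refers to.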

Are there any major objections to this patch?

--
Three Cheers,
Balbir
From: KAMEZAWA Hiroyuki
On Mon, 14 Jun 2010 00:01:45 +0530
Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:

> * Balbir Singh <balbir(a)linux.vnet.ibm.com> [2010-06-08 21:21:46]:
>
> > Selectively control Unmapped Page Cache (nospam version)
> > [rest of the patch description snipped; quoted in full above]
>
> Are there any major objections to this patch?
>

This kind of patch needs a "how well it works" measurement.

- How did you measure the effect of the patch? kernbench is not enough, of course.
- Why don't you believe in the LRU? And if the LRU doesn't work well, should it be
fixed by a knob rather than a generic approach?
- No side effects?

- Linux VM guys tend to say, "free memory is bad memory". OK, so what is the
free memory created by your patch used for? IOW, I can't see the benefit.
If the free memory your patch creates is just used for more page cache,
it will soon be dropped again by your patch itself.

If your patch only dropped pages that are "duplicated, but no longer necessary
because the host holds a copy", I would agree it may increase the available
amount of page cache. But it just drops unmapped pages.
Hmm.

Thanks,
-Kame


From: Balbir Singh
* KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> [2010-06-14 09:28:19]:

> On Mon, 14 Jun 2010 00:01:45 +0530
> Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:
>
> > * Balbir Singh <balbir(a)linux.vnet.ibm.com> [2010-06-08 21:21:46]:
> >
> > > Selectively control Unmapped Page Cache (nospam version)
> > > [rest of the patch description snipped; quoted in full above]
> >
> > Are there any major objections to this patch?
> >
>
> This kind of patch needs a "how well it works" measurement.
>
> - How did you measure the effect of the patch? kernbench is not enough, of course.

I can run other benchmarks as well; I will do so.

> - Why don't you believe in the LRU? And if the LRU doesn't work well, should it be
> fixed by a knob rather than a generic approach?
> - No side effects?

I believe in the LRU; it is just that the problem I am trying to solve is
using double the memory to cache the same data (consider kvm
running in cache=writethrough or writeback mode, where both the hypervisor
and the guest OS maintain a page cache of the same data). As the VMs
grow, the overhead is substantial. In my runs I found up to 60%
duplication in some cases.
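
For a rough guest-side view of the pool such reclaim targets, one can
compare Cached against Mapped in /proc/meminfo (page cache that nobody
maps). The following is only an illustrative userspace sketch; it is not
how the 60% figure above was measured:

#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[128];
	long cached_kb = 0, mapped_kb = 0;

	if (!f)
		return 1;
	while (fgets(line, sizeof(line), f)) {
		/* lines look like "Cached:      751232 kB" */
		sscanf(line, "Cached: %ld kB", &cached_kb);
		sscanf(line, "Mapped: %ld kB", &mapped_kb);
	}
	fclose(f);

	/* page cache that no process maps: the candidates for this reclaim */
	printf("unmapped page cache: %ld kB\n", cached_kb - mapped_kb);
	return 0;
}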

>
> - Linux VM guys tend to say, "free memory is bad memory". OK, so what is the
> free memory created by your patch used for? IOW, I can't see the benefit.
> If the free memory your patch creates is just used for more page cache,
> it will soon be dropped again by your patch itself.
>

Free memory is good when you want to do more on the same
system. I agree that in a bare metal environment that might be
only partially true. I don't have a problem with frequently used data being
cached, but I am targeting a consolidated environment at the moment.
Moreover, the administrator has control via a boot option, so it is
non-intrusive in many ways.

> If your patch only dropped pages that are "duplicated, but no longer necessary
> because the host holds a copy", I would agree it may increase the available
> amount of page cache. But it just drops unmapped pages.
>

Unmapped and unused pages are the best targets; I plan to add slab cache control later.

--
Three Cheers,
Balbir
From: KAMEZAWA Hiroyuki
On Mon, 14 Jun 2010 12:19:55 +0530
Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:
> > - Why don't you believe in the LRU? And if the LRU doesn't work well, should it be
> > fixed by a knob rather than a generic approach?
> > - No side effects?
>
> I believe in the LRU; it is just that the problem I am trying to solve is
> using double the memory to cache the same data (consider kvm
> running in cache=writethrough or writeback mode, where both the hypervisor
> and the guest OS maintain a page cache of the same data). As the VMs
> grow, the overhead is substantial. In my runs I found up to 60%
> duplication in some cases.
>
>
> > - Linux VM guys tend to say, "free memory is bad memory". OK, so what is the
> > free memory created by your patch used for? IOW, I can't see the benefit.
> > If the free memory your patch creates is just used for more page cache,
> > it will soon be dropped again by your patch itself.
>
> Free memory is good when you want to do more on the same
> system. I agree that in a bare metal environment that might be
> only partially true. I don't have a problem with frequently used data being
> cached, but I am targeting a consolidated environment at the moment.
> Moreover, the administrator has control via a boot option, so it is
> non-intrusive in many ways.

It sounds like what you want is not to improve performance etc., but to make
sizing the system easier and to help admins. Right?

From a performance perspective, I don't see any advantage in dropping caches
that can be dropped easily anyway. It just uses CPU for a purpose that may
not be necessary.

Thanks,
-Kame

From: Balbir Singh
* KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> [2010-06-14 16:00:21]:

> On Mon, 14 Jun 2010 12:19:55 +0530
> Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:
> > > - Why don't you believe in the LRU? And if the LRU doesn't work well, should it be
> > > fixed by a knob rather than a generic approach?
> > > - No side effects?
> >
> > I believe in the LRU; it is just that the problem I am trying to solve is
> > using double the memory to cache the same data (consider kvm
> > running in cache=writethrough or writeback mode, where both the hypervisor
> > and the guest OS maintain a page cache of the same data). As the VMs
> > grow, the overhead is substantial. In my runs I found up to 60%
> > duplication in some cases.
> >
> >
> > > - Linux VM guys tend to say, "free memory is bad memory". OK, so what is the
> > > free memory created by your patch used for? IOW, I can't see the benefit.
> > > If the free memory your patch creates is just used for more page cache,
> > > it will soon be dropped again by your patch itself.
> >
> > Free memory is good when you want to do more on the same
> > system. I agree that in a bare metal environment that might be
> > only partially true. I don't have a problem with frequently used data being
> > cached, but I am targeting a consolidated environment at the moment.
> > Moreover, the administrator has control via a boot option, so it is
> > non-intrusive in many ways.
>
> It sounds like what you want is not to improve performance etc., but to make
> sizing the system easier and to help admins. Right?
>

Right; the goal is to free up the memory spent on caching the same data twice.

> From a performance perspective, I don't see any advantage in dropping caches
> that can be dropped easily anyway. It just uses CPU for a purpose that may
> not be necessary.
>

It is not that easy; in a virtualized environment, you do not directly
reclaim, but use a mechanism like ballooning, and that too requires
smart software to decide where to balloon from. This patch (optionally,
if enabled) optimizes that by:

1. Reducing double caching
2. Not requiring new smarts or management software to monitor and
balloon (see the sketch below)
3. Allowing better estimation of free memory by avoiding double caching
4. Allowing immediate use of free memory for other applications or
for starting new guest instances.
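
To make point 2 concrete: a balloon driver's inflate path is essentially
a guest page allocation loop, so when the guest already holds free memory
the allocations succeed cheaply, and otherwise each allocation can itself
stall in reclaim. The sketch below is written in the style of a virtio
balloon driver but is not the real driver; give_page_to_host() is a
hypothetical stand-in for the host notification:

/* Sketch only; assumes kernel context (linux/mm.h, linux/gfp.h). */
static void give_page_to_host(struct page *page);	/* hypothetical */

static int balloon_inflate(unsigned long nr_pages)
{
	unsigned long i;

	for (i = 0; i < nr_pages; i++) {
		/* cheap if free memory exists; may stall in reclaim if not */
		struct page *page = alloc_page(GFP_HIGHUSER |
					       __GFP_NORETRY | __GFP_NOMEMALLOC);
		if (!page)
			return -ENOMEM;
		give_page_to_host(page);
	}
	return 0;
}

With this patch enabled, the guest tends to keep such free pages around,
so a loop like the one above completes without forcing reclaim inside the
guest.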

--
Three Cheers,
Balbir