Linux/Guest cooperative unmapped page cache control [Kernel]

From: Avi Kivity on 10 Jun 2010 05:50

On 06/08/2010 06:51 PM, Balbir Singh wrote:
> Balloon unmapped page cache pages first
>
> From: Balbir Singh<balbir(a)linux.vnet.ibm.com>
>
> This patch builds on the ballooning infrastructure by ballooning unmapped
> page cache pages first. It looks for low hanging fruit first and tries
> to reclaim clean unmapped pages first.
>

I'm not sure victimizing unmapped cache pages is a good idea. Shouldn't
page selection use the LRU for recency information instead of the cost
of guest reclaim? Dropping a frequently used unmapped cache page can be
more expensive than dropping an unused text page that was loaded as part
of some executable's initialization and forgotten.

Many workloads have many unmapped cache pages, for example static web
serving and the all-important kernel build.

> The key advantage was that it resulted in lesser RSS usage in the host and
> more cached usage, indicating that the caching had been pushed towards
> the host. The guest cached memory usage was lower and free memory in
> the guest was also higher.
>

Caching in the host is only helpful if the cache can be shared,
otherwise it's better to cache in the guest.
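
A back-of-the-envelope sketch of the point about sharing, under stated assumptions: the function name `cache_footprint_mb` and the `shared_fraction` parameter are hypothetical, and the model assumes the host keeps a single copy of any data that is identical across guests. It only compares how much memory is spent holding the same working set in guest caches versus in the host cache.

```python
def cache_footprint_mb(guests, file_mb, shared_fraction):
    """Return (MB cached in guests, MB cached in host) for one working set.

    guests          -- number of guests reading `file_mb` of file data each
    shared_fraction -- assumed share of that data identical across guests,
                       which the host is assumed to cache only once
    """
    # Each guest holds its own private copy of everything it caches.
    in_guests = guests * file_mb
    # The host holds shared data once and unshared data once per guest.
    in_host = file_mb * shared_fraction + guests * file_mb * (1 - shared_fraction)
    return in_guests, in_host


if __name__ == "__main__":
    print(cache_footprint_mb(4, 100, 1.0))  # fully shared data
    print(cache_footprint_mb(4, 100, 0.0))  # nothing shared
```

With nothing shared, the host cache is as large as the guest caches combined, so in this model moving the cache to the host saves no memory and only adds distance between the guest and its data.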

--
error compiling committee.c: too many arguments to function

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Balbir Singh on 10 Jun 2010 10:30

* Avi Kivity <avi(a)redhat.com> [2010-06-10 12:43:11]:

> On 06/08/2010 06:51 PM, Balbir Singh wrote:
> >Balloon unmapped page cache pages first
> >
> >From: Balbir Singh<balbir(a)linux.vnet.ibm.com>
> >
> >This patch builds on the ballooning infrastructure by ballooning unmapped
> >page cache pages first. It looks for low hanging fruit first and tries
> >to reclaim clean unmapped pages first.
>
> I'm not sure victimizing unmapped cache pages is a good idea.
> Shouldn't page selection use the LRU for recency information instead
> of the cost of guest reclaim? Dropping a frequently used unmapped
> cache page can be more expensive than dropping an unused text page
> that was loaded as part of some executable's initialization and
> forgotten.
>

We victimize the unmapped cache only if it is unused (in LRU order).
We don't force the issue too much. We also have free slab cache to go
after.

> Many workloads have many unmapped cache pages, for example static
> web serving and the all-important kernel build.
>

I've tested kernbench; you can see the results in the original posting,
and there is no observable overhead as a result of the patch in my
run.

> >The key advantage was that it resulted in lesser RSS usage in the host and
> >more cached usage, indicating that the caching had been pushed towards
> >the host. The guest cached memory usage was lower and free memory in
> >the guest was also higher.
>
> Caching in the host is only helpful if the cache can be shared,
> otherwise it's better to cache in the guest.
>

Hmm.. so we would need a balloon cache hint from the monitor, so that
it is not unconditional? Overall my results show the following:

1. No drastic reduction of guest unmapped cache, just sufficient to
show lesser RSS in the host. More freeable memory (as in cached
memory + free memory) visible on the host.
2. No significant impact on the benchmark (numbers) running in the
guest.

--
Three Cheers,
Balbir

From: Balbir Singh on 11 Jun 2010 00:50

* KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> [2010-06-11 10:54:41]:

> On Thu, 10 Jun 2010 17:07:32 -0700
> Dave Hansen <dave(a)linux.vnet.ibm.com> wrote:
>
> > On Thu, 2010-06-10 at 19:55 +0530, Balbir Singh wrote:
> > > > I'm not sure victimizing unmapped cache pages is a good idea.
> > > > Shouldn't page selection use the LRU for recency information instead
> > > > of the cost of guest reclaim? Dropping a frequently used unmapped
> > > > cache page can be more expensive than dropping an unused text page
> > > > that was loaded as part of some executable's initialization and
> > > > forgotten.
> > >
> > > We victimize the unmapped cache only if it is unused (in LRU order).
> > > We don't force the issue too much. We also have free slab cache to go
> > > after.
> >
> > Just to be clear, let's say we have a mapped page (say of /sbin/init)
> > that's been unreferenced since _just_ after the system booted. We also
> > have an unmapped page cache page of a file often used at runtime, say
> > one from /etc/resolv.conf or /etc/passwd.
> >
>
> Hmm. I'm not fan of estimating working set size by calculation
> based on some numbers without considering history or feedback.
>
> Can't we use some kind of feedback algorithm such as hi-low-watermark, random walk
> or GA (or something smarter) to detect the size?
>

Could you please clarify at what level you are suggesting size
detection? I assume it is outside the OS, right?

--
Three Cheers,
Balbir

From: Balbir Singh on 11 Jun 2010 01:10

* Dave Hansen <dave(a)linux.vnet.ibm.com> [2010-06-10 17:07:32]:

> On Thu, 2010-06-10 at 19:55 +0530, Balbir Singh wrote:
> > > I'm not sure victimizing unmapped cache pages is a good idea.
> > > Shouldn't page selection use the LRU for recency information instead
> > > of the cost of guest reclaim? Dropping a frequently used unmapped
> > > cache page can be more expensive than dropping an unused text page
> > > that was loaded as part of some executable's initialization and
> > > forgotten.
> >
> > We victimize the unmapped cache only if it is unused (in LRU order).
> > We don't force the issue too much. We also have free slab cache to go
> > after.
>
> Just to be clear, let's say we have a mapped page (say of /sbin/init)
> that's been unreferenced since _just_ after the system booted. We also
> have an unmapped page cache page of a file often used at runtime, say
> one from /etc/resolv.conf or /etc/passwd.
>
> Which page will be preferred for eviction with this patch set?
>

In this case the order is as follows:

1. First we pick free pages, if any
2. If we don't have free pages, we go after unmapped page cache and
slab cache
3. If that fails as well, we go after regular memory

In the scenario that you describe, we'll not be able to easily free up
the frequently referenced page from /etc/*. The code will move on to
step 3 and do its regular reclaim.
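
The ordering above can be sketched as a small model. This is an illustration under assumptions, not the patch's code: the `Page` class, its fields and `pick_balloon_victims` are hypothetical names, and step 3 ("regular reclaim") is simplified to "everything else, not-recently-referenced pages first".

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    kind: str                 # "free", "unmapped_cache", "slab" or "mapped"
    referenced: bool = False  # recently used according to LRU aging


def pick_balloon_victims(pages, want):
    """Return up to `want` pages in the three-step order described above."""
    # Step 1: free pages, if any.
    victims = [p for p in pages if p.kind == "free"]
    # Step 2: unmapped page cache and slab cache, but only if unused.
    victims += [p for p in pages
                if p.kind in ("unmapped_cache", "slab") and not p.referenced]
    # Step 3: regular reclaim over the rest (simplified: unreferenced first).
    chosen = {id(p) for p in victims}
    rest = [p for p in pages if id(p) not in chosen]
    victims += sorted(rest, key=lambda p: p.referenced)
    return victims[:want]


if __name__ == "__main__":
    pages = [
        Page("/etc/resolv.conf", "unmapped_cache", referenced=True),
        Page("/sbin/init", "mapped", referenced=False),
    ]
    print([p.name for p in pick_balloon_victims(pages, 1)])
```

In this model the frequently referenced /etc/resolv.conf page is skipped in step 2, so the long-unreferenced /sbin/init page is the one picked in step 3.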

--
Three Cheers,
Balbir

From: Balbir Singh on 11 Jun 2010 03:10

* KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> [2010-06-11 14:05:53]:

> On Fri, 11 Jun 2010 10:16:32 +0530
> Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:
>
> > * KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> [2010-06-11 10:54:41]:
> >
> > > On Thu, 10 Jun 2010 17:07:32 -0700
> > > Dave Hansen <dave(a)linux.vnet.ibm.com> wrote:
> > >
> > > > On Thu, 2010-06-10 at 19:55 +0530, Balbir Singh wrote:
> > > > > > I'm not sure victimizing unmapped cache pages is a good idea.
> > > > > > Shouldn't page selection use the LRU for recency information instead
> > > > > > of the cost of guest reclaim? Dropping a frequently used unmapped
> > > > > > cache page can be more expensive than dropping an unused text page
> > > > > > that was loaded as part of some executable's initialization and
> > > > > > forgotten.
> > > > >
> > > > > We victimize the unmapped cache only if it is unused (in LRU order).
> > > > > We don't force the issue too much. We also have free slab cache to go
> > > > > after.
> > > >
> > > > Just to be clear, let's say we have a mapped page (say of /sbin/init)
> > > > that's been unreferenced since _just_ after the system booted. We also
> > > > have an unmapped page cache page of a file often used at runtime, say
> > > > one from /etc/resolv.conf or /etc/passwd.
> > > >
> > >
> > > Hmm. I'm not fan of estimating working set size by calculation
> > > based on some numbers without considering history or feedback.
> > >
> > > Can't we use some kind of feedback algorithm such as hi-low-watermark, random walk
> > > or GA (or something smarter) to detect the size?
> > >
> >
> > Could you please clarify at what level you are suggesting size
> > detection? I assume it is outside the OS, right?
> >
> "OS" includes kernel and system programs ;)
>
> I can think of both in-kernel and user-space approaches, and they should
> complement each other.
>
> An example of a kernel-based approach:
> 1. add a shrinker callback(A) for balloon-driver-for-guest as guest kswapd.
> 2. add a shrinker callback(B) for balloon-driver-for-host as host kswapd.
> (I guess current balloon driver is only for host. Please imagine.)
>
> (A) increases free memory in Guest.
> (B) increases free memory in Host.
>
> This is an example of feedback based memory resizing between host and guest.
>
> I think (B) is necessary at least before considering complicated things.

B is left to the hypervisor and the memory policy running on it. My
patches address Linux running as a guest, with a Linux hypervisor at
the moment, but that can be extended to other balloon drivers as well.

>
> To implement something clever, (A) and (B) should take into account that
> how frequently memory reclaim in guest (which requires some I/O) happens.
>

Yes, I think the policy in the hypervisor needs to look at those
details as well.

> If done outside the kernel, I think using memcg is better than depending on
> the balloon driver. But a co-operative balloon and memcg may show us something
> good.
>

Yes, agreed. Co-operative is better; if there is no co-operation, then
memcg might be used for enforcement.
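
A minimal sketch of the hi-low-watermark style of feedback mentioned in the quoted text, assuming a policy that periodically looks at the guest's free memory and resizes the balloon. The function name `adjust_balloon` and all of its parameters are hypothetical; none of this comes from the patch or from an existing balloon driver interface.

```python
def adjust_balloon(balloon_pages, guest_free_pages, low, high, step):
    """One feedback step; returns the new balloon size in pages.

    low, high -- assumed watermarks on the guest's free page count
    step      -- assumed maximum change per invocation
    """
    if guest_free_pages > high:
        # Guest has slack: inflate, but never push it below the high mark.
        return balloon_pages + min(step, guest_free_pages - high)
    if guest_free_pages < low:
        # Guest is under pressure: deflate, but never below an empty balloon.
        return max(0, balloon_pages - min(step, low - guest_free_pages))
    # Between the watermarks: leave the balloon alone to avoid oscillation.
    return balloon_pages


if __name__ == "__main__":
    print(adjust_balloon(100, 500, low=100, high=300, step=50))
    print(adjust_balloon(100, 50, low=100, high=300, step=50))
```

The gap between the two watermarks is the design choice that matters here: it gives the loop a dead band, so a guest hovering around one threshold does not cause the balloon to inflate and deflate on every step.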

--
Three Cheers,
Balbir
