From: Dave Hansen on
On Thu, 2010-06-10 at 19:55 +0530, Balbir Singh wrote:
> > I'm not sure victimizing unmapped cache pages is a good idea.
> > Shouldn't page selection use the LRU for recency information instead
> > of the cost of guest reclaim? Dropping a frequently used unmapped
> > cache page can be more expensive than dropping an unused text page
> > that was loaded as part of some executable's initialization and
> > forgotten.
>
> We victimize the unmapped cache only if it is unused (in LRU order).
> We don't force the issue too much. We also have free slab cache to go
> after.

Just to be clear, let's say we have a mapped page (say of /sbin/init)
that's been unreferenced since _just_ after the system booted. We also
have an unmapped page cache page of a file often used at runtime, say
one from /etc/resolv.conf or /etc/passwd.

Which page will be preferred for eviction with this patch set?
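
To make the question concrete, here is a minimal sketch of the two extremes.
It assumes a toy model, not the patch set's actual logic: the Page class, the
last_used timestamps, and both selection functions are hypothetical names
made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Page:
    name: str
    mapped: bool
    last_used: int  # hypothetical timestamp; larger means more recently used


def pick_victim_lru(pages):
    """Pure recency: evict the least recently used page, mapped or not."""
    return min(pages, key=lambda p: p.last_used)


def pick_victim_unmapped_first(pages):
    """Strict preference: evict unmapped page cache first, oldest first.

    Falls back to plain LRU only when no unmapped page exists.
    """
    unmapped = [p for p in pages if not p.mapped]
    candidates = unmapped or pages
    return min(candidates, key=lambda p: p.last_used)


pages = [
    Page("/sbin/init text", mapped=True, last_used=1),      # idle since boot
    Page("/etc/resolv.conf", mapped=False, last_used=1000),  # read often
]

lru_victim = pick_victim_lru(pages)
unmapped_victim = pick_victim_unmapped_first(pages)
```

Under pure LRU the idle /sbin/init page goes first; under a strict
unmapped-first policy the hot /etc/resolv.conf page goes first.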

-- Dave

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: KAMEZAWA Hiroyuki on
On Thu, 10 Jun 2010 17:07:32 -0700
Dave Hansen <dave(a)linux.vnet.ibm.com> wrote:

> On Thu, 2010-06-10 at 19:55 +0530, Balbir Singh wrote:
> > > I'm not sure victimizing unmapped cache pages is a good idea.
> > > Shouldn't page selection use the LRU for recency information instead
> > > of the cost of guest reclaim? Dropping a frequently used unmapped
> > > cache page can be more expensive than dropping an unused text page
> > > that was loaded as part of some executable's initialization and
> > > forgotten.
> >
> > We victimize the unmapped cache only if it is unused (in LRU order).
> > We don't force the issue too much. We also have free slab cache to go
> > after.
>
> Just to be clear, let's say we have a mapped page (say of /sbin/init)
> that's been unreferenced since _just_ after the system booted. We also
> have an unmapped page cache page of a file often used at runtime, say
> one from /etc/resolv.conf or /etc/passwd.
>

Hmm. I'm not a fan of estimating working set size by calculation
based on some numbers without considering history or feedback.

Can't we use some kind of feedback algorithm, such as high/low watermarks,
a random walk, or a GA (or something smarter), to detect the size?

Thanks,
-Kame

From: KAMEZAWA Hiroyuki on
On Fri, 11 Jun 2010 10:16:32 +0530
Balbir Singh <balbir(a)linux.vnet.ibm.com> wrote:

> * KAMEZAWA Hiroyuki <kamezawa.hiroyu(a)jp.fujitsu.com> [2010-06-11 10:54:41]:
>
> > On Thu, 10 Jun 2010 17:07:32 -0700
> > Dave Hansen <dave(a)linux.vnet.ibm.com> wrote:
> >
> > > On Thu, 2010-06-10 at 19:55 +0530, Balbir Singh wrote:
> > > > > I'm not sure victimizing unmapped cache pages is a good idea.
> > > > > Shouldn't page selection use the LRU for recency information instead
> > > > > of the cost of guest reclaim? Dropping a frequently used unmapped
> > > > > cache page can be more expensive than dropping an unused text page
> > > > > that was loaded as part of some executable's initialization and
> > > > > forgotten.
> > > >
> > > > We victimize the unmapped cache only if it is unused (in LRU order).
> > > > We don't force the issue too much. We also have free slab cache to go
> > > > after.
> > >
> > > Just to be clear, let's say we have a mapped page (say of /sbin/init)
> > > that's been unreferenced since _just_ after the system booted. We also
> > > have an unmapped page cache page of a file often used at runtime, say
> > > one from /etc/resolv.conf or /etc/passwd.
> > >
> >
> > Hmm. I'm not a fan of estimating working set size by calculation
> > based on some numbers without considering history or feedback.
> >
> > Can't we use some kind of feedback algorithm, such as high/low watermarks,
> > a random walk, or a GA (or something smarter), to detect the size?
> >
>
> Could you please clarify at what level you are suggesting size
> detection? I assume it is outside the OS, right?
>
"OS" includes the kernel and system programs ;)

I can think of both in-kernel and userspace approaches, and they should
complement each other.

An example of a kernel-based approach:
1. Add a shrinker callback (A) for a balloon-driver-for-guest, as guest kswapd.
2. Add a shrinker callback (B) for a balloon-driver-for-host, as host kswapd.
(I guess the current balloon driver is only for the host. Please imagine.)

(A) increases free memory in Guest.
(B) increases free memory in Host.

This is an example of feedback-based memory resizing between host and guest.

I think (B) is necessary, at least before considering complicated things.

To implement something clever, (A) and (B) should take into account
how frequently memory reclaim in the guest (which requires some I/O) happens.
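
As a rough illustration of that feedback loop, here is a minimal sketch of one
control step. It is a toy model and not kernel code: the function name, the
watermarks, the step size, and the "busy" reclaim threshold are all
hypothetical values chosen for illustration.

```python
def adjust_guest_target(target_mb, guest_reclaims_per_s, host_free_mb,
                        low_mb=512, high_mb=2048, step_mb=64,
                        busy_reclaim=10.0):
    """Return the guest's new memory target after one feedback step.

    (B) Host is short on free memory: shrink the guest, unless the guest
        is already reclaiming so often that shrinking would only add I/O.
    (A) Guest is reclaiming often and the host has room: grow the guest.
    Otherwise leave the target alone.
    """
    guest_is_busy = guest_reclaims_per_s >= busy_reclaim
    if host_free_mb < low_mb and not guest_is_busy:
        return target_mb - step_mb   # (B): give memory back to the host
    if host_free_mb > high_mb and guest_is_busy:
        return target_mb + step_mb   # (A): give memory to the guest
    return target_mb


# Idle guest, host under pressure: the guest shrinks.
shrunk = adjust_guest_target(1024, guest_reclaims_per_s=0.0, host_free_mb=256)
# Busy guest, host has plenty free: the guest grows.
grown = adjust_guest_target(1024, guest_reclaims_per_s=20.0, host_free_mb=4096)
# Busy guest, host under pressure: no change; a smarter policy is needed here.
held = adjust_guest_target(1024, guest_reclaims_per_s=20.0, host_free_mb=256)
```

The point of the sketch is only that the decision uses observed reclaim
frequency as feedback, rather than a size computed up front.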

If doing this outside the kernel, I think using memcg is better than
depending on the balloon driver. But a cooperative balloon and memcg may
show us something good.

Thanks,
-Kame

From: Dave Hansen on
On Mon, 2010-06-14 at 14:18 +0530, Balbir Singh wrote:
> 1. A slab page will not be freed until the entire page is free (all
> slabs have been kfree'd so to speak). Normal reclaim will definitely
> free this page, but a lot of it depends on how frequently we are
> scanning the LRU list and when this page got added.

You don't have to be freeing entire slab pages for the reclaim to have
been useful. You could just be making space so that _future_
allocations fill in the slab holes you just created. You may not be
freeing pages, but you're reducing future system pressure.
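
A minimal sketch of that effect, assuming a toy slab model: the SlabCache
class and OBJS_PER_PAGE are hypothetical, and real slab allocators are far
more involved. Freeing objects leaves holes without releasing any page, and
later allocations fill those holes instead of taking new pages.

```python
OBJS_PER_PAGE = 4  # hypothetical number of objects per slab page


class SlabCache:
    def __init__(self):
        self.pages = []  # each page is a list of slots; None marks a hole

    def alloc(self, obj):
        # Reuse the first hole in an existing page before taking a new page.
        for page in self.pages:
            for i, slot in enumerate(page):
                if slot is None:
                    page[i] = obj
                    return
        page = [None] * OBJS_PER_PAGE
        page[0] = obj
        self.pages.append(page)

    def free(self, obj):
        # Freeing only punches a hole; this toy model never releases pages.
        for page in self.pages:
            if obj in page:
                page[page.index(obj)] = None
                return
        raise ValueError(f"{obj!r} is not allocated")


cache = SlabCache()
for n in range(8):
    cache.alloc(f"obj{n}")      # fills two pages completely

cache.free("obj1")              # one hole in the first page
cache.free("obj6")              # one hole in the second page
pages_before = len(cache.pages)  # still two pages: nothing was freed

cache.alloc("new0")             # lands in the first page's hole
cache.alloc("new1")             # lands in the second page's hole
pages_after = len(cache.pages)   # still two pages: no new page was needed
```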

If unmapped page cache is the easiest thing to evict, then it should be
the first thing that goes when a balloon request comes in, which is the
case this patch is trying to handle. If it isn't the easiest thing to
evict, then we _shouldn't_ evict it.

-- Dave

From: Dave Hansen on
On Mon, 2010-06-14 at 16:01 +0300, Avi Kivity wrote:
> If we drop unmapped pagecache pages, we need to be sure they can be
> backed by the host, and that depends on the amount of sharing.

You also have to set the host up properly, and continue to maintain
it in a way that finds and eliminates duplicates.

I saw some benchmarks where KSM was doing great, finding lots of
duplicate pages. Then, the host filled up, and guests started
reclaiming. As memory pressure got worse, so did KSM's ability to find
duplicates.

At the same time, I see what you're trying to do with this. It really
can be an alternative to ballooning if we do it right, since ballooning
would probably evict similar pages. Although it would only work in idle
guests, what about a knob that the host can turn to just get the guest
to start running reclaim?

-- Dave