From: Rik van Riel on
On 04/02/2010 05:13 AM, KOSAKI Motohiro wrote:
>>> Yeah, I don't want to ignore .33-stable either. If I can't find the root cause
>>> in 2-3 days, I'll revert the guilty patch anyway.
>>>
>>
>> It's a good idea to avoid fixing a bug one-way-in-stable,
>> other-way-in-mainline. Because then we have new code in both trees
>> which is different. And the -stable guys sensibly like to see code get
>> a bit of a shakedown in mainline before backporting it.
>>
>> So it would be better to merge the "simple" patch into mainline, tagged
>> for -stable backporting. Then we can later implement the larger fix in
>> mainline, perhaps starting by reverting the "simple" fix.
>
> .....ok. I won't stand in the way of your code maintainership, although I still
> think we need to fix the issue completely.

Agreed on the revert.

Acked-by: Rik van Riel <riel(a)redhat.com>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Rik van Riel on
On 04/05/2010 11:31 PM, Wu Fengguang wrote:
> On Tue, Apr 06, 2010 at 10:58:43AM +0800, KOSAKI Motohiro wrote:
>> Again, I didn't say his patch has no worth. I only said we must not
>> ignore the downside.
>
> Right, we should document both the upside and downside.

The downside is obvious: streaming IO (use-once data
that does not fit in the cache) can push out data that
is used more often - requiring that it be swapped in
at a later point in time.

I understand what Shaohua's patch does, but I do not
understand the upside. What good does it do to increase
the size of the cache for streaming IO data, which is
generally touched only once?

What kind of performance benefits can we get by doing
that?

> The main difference happens when the file:anon scan ratio > 100:1.
>
> For the current percent[] based computing, percent[0]=0 hence nr[0]=0
> which disables anon list scan unconditionally, for good or for bad.
>
> For the direct nr[] computing,
> - nr[0] will be 0 for typical file servers, because with priority=12
> and anon lru size < 1.6GB, nr[0] = (anon_size/4096)/100 < 1
> - nr[0] will be non-zero when priority=1 and anon_size > 100 pages,
> this stops OOM for Shaohua's test case, however may not be enough to
> guarantee safety (your previous reverting patch can provide this
> guarantee).
>
> I liked Shaohua's patch a lot -- it adapts well to both the
> file-server case and the mostly-anon-pages case :)


From: Wu Fengguang on
On Tue, Apr 06, 2010 at 11:40:47AM +0800, Rik van Riel wrote:
> On 04/05/2010 11:31 PM, Wu Fengguang wrote:
> > On Tue, Apr 06, 2010 at 10:58:43AM +0800, KOSAKI Motohiro wrote:
> >> Again, I didn't say his patch has no worth. I only said we must not
> >> ignore the downside.
> >
> > Right, we should document both the upside and downside.
>
> The downside is obvious: streaming IO (use-once data
> that does not fit in the cache) can push out data that
> is used more often - requiring that it be swapped in
> at a later point in time.
>
> I understand what Shaohua's patch does, but I do not
> understand the upside. What good does it do to increase
> the size of the cache for streaming IO data, which is
> generally touched only once?

Not that bad :) With Shaohua's patch the anon list will typically
_never_ get scanned, just like before.

If it's mostly use-once IO, the file:anon scan ratio will be 1000:1 or
even 10000:1, with priority=12. Then only anon lists larger than 16GB or
160GB will get nr[0] >= 1.
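
To make the arithmetic above concrete, here is a minimal userspace sketch in C.
It assumes 4KB pages and approximates a 1000:1 file:anon scan ratio as
fraction=1, denominator=1000. The names scan_target() and gib_to_pages() are
hypothetical helpers for illustration, not kernel functions, and the code only
follows the general shape "shift by priority, then scale"; it is not the
actual vmscan code.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of pages to scan on one LRU list.
 * For priority > 0 the list size is shifted right by priority and then
 * scaled by fraction/denominator. At priority 0 the whole list is returned.
 */
static uint64_t scan_target(uint64_t lru_pages, unsigned priority,
                            uint64_t fraction, uint64_t denominator)
{
    assert(denominator != 0);
    assert(priority < 64);          /* keep the shift well defined */

    if (priority == 0)
        return lru_pages;
    return (lru_pages >> priority) * fraction / denominator;
}

/* Hypothetical helper: convert GiB to a page count, assuming 4KB pages. */
static uint64_t gib_to_pages(uint64_t gib)
{
    return gib * 262144;            /* 2^30 / 2^12 pages per GiB */
}
```

Under these assumptions, scan_target(gib_to_pages(15), 12, 1, 1000) is 0 and
scan_target(gib_to_pages(16), 12, 1, 1000) is 1, which matches the "larger
than 16GB" threshold for a 1000:1 ratio. With denominator=10000 the first
non-zero result appears at roughly 160GB.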

> What kind of performance benefits can we get by doing
> that?

So vmscan behavior and performance remain the same as before.

For a really large anon list, such a workload is beyond our imagination,
so we cannot assert that "don't scan the anon list" will be a benefit.

On the other hand, in the test case of "do stream IO when most memory
is occupied by tmpfs pages", it is very bad behavior to refuse to scan the
anon list in the normal case and then suddenly start scanning _the whole_
anon list when priority hits 0. Shaohua's patch helps by gradually
increasing the scan nr of the anon list as memory pressure increases.
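
The contrast described above can be sketched with two small userspace C
functions. This is an illustration under stated assumptions, not the kernel
implementation: direct_scan() and percent_scan() are hypothetical names, the
percent[] variant is modeled only on how this thread describes it (the ratio
is truncated to an integer percentage first), and both are assumed to scan
the whole list at priority 0.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Direct nr[] computing: shift by priority, then scale by
 * fraction/denominator. The result grows step by step as priority drops.
 */
static uint64_t direct_scan(uint64_t lru_pages, unsigned priority,
                            uint64_t fraction, uint64_t denominator)
{
    assert(denominator != 0);
    assert(priority < 64);

    if (priority == 0)
        return lru_pages;
    return (lru_pages >> priority) * fraction / denominator;
}

/*
 * percent[] based computing (assumed model): the ratio is first truncated
 * to a whole percentage, so any share below 1% becomes 0 and the list is
 * not scanned at all until priority reaches 0.
 */
static uint64_t percent_scan(uint64_t lru_pages, unsigned priority,
                             uint64_t fraction, uint64_t denominator)
{
    assert(denominator != 0);
    assert(priority < 64);

    if (priority == 0)
        return lru_pages;

    uint64_t percent = fraction * 100 / denominator;
    return (lru_pages >> priority) * percent / 100;
}
```

For a made-up anon list of 1,000,000 pages and a 200:1 scan ratio,
direct_scan() gives 1, 78 and 2500 pages at priority 12, 6 and 1, while
percent_scan() stays at 0 for every non-zero priority. Both jump to the full
1,000,000 pages at priority 0, so only the direct variant ramps up gradually.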

Thanks,
Fengguang

From: Wu Fengguang on
Shaohua,

> +		scan = zone_nr_lru_pages(zone, sc, l);
> +		if (priority) {
> +			scan >>= priority;
> +			scan = (scan * fraction[file] / denominator[file]);

Ah, the (scan * fraction[file]) may overflow on a 32-bit kernel!
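
A minimal userspace sketch of that overflow, assuming the operands are 32 bits
wide as unsigned long is on a 32-bit kernel. Here uint32_t stands in for that
type, and scale_32bit() and scale_64bit() are hypothetical names, not
functions from the patch. The second variant widens the product to 64 bits
before dividing. Kernel code would normally do such a division through a
helper like div64_u64() rather than the plain `/` used here.

```c
#include <assert.h>
#include <stdint.h>

/* 32-bit intermediate: the product wraps modulo 2^32 before the division. */
static uint32_t scale_32bit(uint32_t scan, uint32_t fraction,
                            uint32_t denominator)
{
    assert(denominator != 0);
    uint32_t product = (uint32_t)(scan * fraction);  /* may wrap around */
    return product / denominator;
}

/*
 * 64-bit intermediate: the product cannot wrap. The result fits in 32 bits
 * as long as fraction <= denominator, which this sketch assumes.
 */
static uint32_t scale_64bit(uint32_t scan, uint32_t fraction,
                            uint32_t denominator)
{
    assert(denominator != 0);
    assert(fraction <= denominator);
    uint64_t product = (uint64_t)scan * fraction;
    return (uint32_t)(product / denominator);
}
```

With the made-up values scan=100000, fraction=50000 and denominator=100000,
the exact answer is 50000. scale_64bit() returns that, while scale_32bit()
returns 7050 because 5,000,000,000 wraps to 705,032,704 before the division.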

Thanks,
Fengguang
From: Shaohua Li on
On Tue, Apr 06, 2010 at 12:49:10PM +0800, Wu, Fengguang wrote:
> On Tue, Apr 06, 2010 at 11:40:47AM +0800, Rik van Riel wrote:
> > On 04/05/2010 11:31 PM, Wu Fengguang wrote:
> > > On Tue, Apr 06, 2010 at 10:58:43AM +0800, KOSAKI Motohiro wrote:
> > >> Again, I didn't say his patch has no worth. I only said we must not
> > >> ignore the downside.
> > >
> > > Right, we should document both the upside and downside.
> >
> > The downside is obvious: streaming IO (use-once data
> > that does not fit in the cache) can push out data that
> > is used more often - requiring that it be swapped in
> > at a later point in time.
> >
> > I understand what Shaohua's patch does, but I do not
> > understand the upside. What good does it do to increase
> > the size of the cache for streaming IO data, which is
> > generally touched only once?
>
> Not that bad :) With Shaohua's patch the anon list will typically
> _never_ get scanned, just like before.
>
> If it's mostly use-once IO, the file:anon scan ratio will be 1000:1 or
> even 10000:1, with priority=12. Then only anon lists larger than 16GB or
> 160GB will get nr[0] >= 1.
>
> > What kind of performance benefits can we get by doing
> > that?
>
> So vmscan behavior and performance remain the same as before.
>
> For a really large anon list, such a workload is beyond our imagination,
> so we cannot assert that "don't scan the anon list" will be a benefit.
>
> On the other hand, in the test case of "do stream IO when most memory
> is occupied by tmpfs pages", it is very bad behavior to refuse to scan the
> anon list in the normal case and then suddenly start scanning _the whole_
> anon list when priority hits 0. Shaohua's patch helps by gradually
> increasing the scan nr of the anon list as memory pressure increases.

Yep, the gradually increasing scan nr is the main advantage in my mind.

Thanks,
Shaohua