From: KOSAKI Motohiro on
Hi

> Commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26 introduces a regression.
> With it, our tmpfs test always OOMs. The test has a lot of rotated anon
> pages, which makes percent[0] zero. Actually percent[0] is a very small
> value, but our calculation rounds it to zero. The commit makes vmscan
> completely skip anon pages and causes an oops.
> One option is to force percent[x] to 1 in get_scan_ratio() if it is
> zero. See the patch below.
> But the offending commit still changes behavior. Without the commit, we scan
> all pages if priority is zero; the patch below doesn't fix this. I don't know
> if it's required to fix this too.

Can you please post your /proc/meminfo and the reproducer program? I'll dig into it.

Very unfortunately, this patch isn't acceptable. In the past, vmscan
had similar logic, but the 1% swap-out caused lots of bug reports.



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Shaohua Li on
On Tue, 2010-03-30 at 14:08 +0800, KOSAKI Motohiro wrote:
> Hi
>
> > Commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26 introduces a regression.
> > With it, our tmpfs test always oom. The test has a lot of rotated anon
> > pages and cause percent[0] zero. Actually the percent[0] is a very small
> > value, but our calculation round it to zero. The commit makes vmscan
> > completely skip anon pages and cause oops.
> > An option is if percent[x] is zero in get_scan_ratio(), forces it
> > to 1. See below patch.
> > But the offending commit still changes behavior. Without the commit, we scan
> > all pages if priority is zero, below patch doesn't fix this. Don't know if
> > It's required to fix this too.
>
> Can you please post your /proc/meminfo
Attached.
> and reproduce program? I'll digg it.
Our test is quite simple. Mount tmpfs with double the memory size and store several
copies (memory size * 2/G) of the kernel in tmpfs, then do a kernel build.
For example, with 3G of memory the tmpfs size is 6G and there are 6
kernel copies.
> Very unfortunately, this patch isn't acceptable. In past time, vmscan
> had similar logic, but 1% swap-out made lots bug reports.
Can you elaborate on this?
Completely restoring the previous behavior (do a full scan at priority 0) is
OK too.
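
To make the behavior change discussed above concrete, here is a minimal userspace sketch, not the kernel code. It assumes the per-list scan count is the list size shifted right by the reclaim priority and then scaled by percent[], and that "before the commit" the percentage was skipped at priority 0. The function name, the `bypass_at_prio0` flag, and `MODEL_DEF_PRIORITY` are hypothetical and exist only for illustration.

```c
#include <assert.h>

/* Assumed upper bound for the priority argument in this model. */
#define MODEL_DEF_PRIORITY 12

/*
 * Hypothetical model of how many pages of one LRU list get scanned.
 * bypass_at_prio0 != 0 models the older behavior described in the thread:
 * at priority 0 the percentage is ignored and the whole list is scanned.
 * bypass_at_prio0 == 0 models always applying the percentage.
 */
static unsigned long nr_to_scan_model(unsigned long lru_pages, int priority,
				      unsigned long percent, int bypass_at_prio0)
{
	unsigned long scan = lru_pages;

	assert(priority >= 0 && priority <= MODEL_DEF_PRIORITY);
	assert(percent <= 100);

	if (priority == 0 && bypass_at_prio0)
		return scan;		/* full scan, percent ignored */

	scan >>= priority;		/* smaller window at lower urgency */
	return scan * percent / 100;	/* integer math, truncates */
}
```

With percent == 0 and priority == 0, the bypass variant returns the full list size while the other variant returns 0, which matches the "completely skip anon pages" symptom reported in this thread.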
From: KOSAKI Motohiro on
> On Tue, 2010-03-30 at 14:08 +0800, KOSAKI Motohiro wrote:
> > Hi
> >
> > > Commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26 introduces a regression.
> > > With it, our tmpfs test always oom. The test has a lot of rotated anon
> > > pages and cause percent[0] zero. Actually the percent[0] is a very small
> > > value, but our calculation round it to zero. The commit makes vmscan
> > > completely skip anon pages and cause oops.
> > > An option is if percent[x] is zero in get_scan_ratio(), forces it
> > > to 1. See below patch.
> > > But the offending commit still changes behavior. Without the commit, we scan
> > > all pages if priority is zero, below patch doesn't fix this. Don't know if
> > > It's required to fix this too.
> >
> > Can you please post your /proc/meminfo
> attached.
> > and reproduce program? I'll digg it.
> our test is quite sample. mount tmpfs with double memory size and store several
> copies (memory size * 2/G) of kernel in tmpfs, and then do kernel build.
> for example, there is 3G memory and then tmpfs size is 6G and there is 6
> kernel copy.

Wow, tmpfs size > memsize!


> > Very unfortunately, this patch isn't acceptable. In past time, vmscan
> > had similar logic, but 1% swap-out made lots bug reports.
> can you elaborate this?
> Completely restore previous behavior (do full scan with priority 0) is
> ok too.

This is an option, but we need to know the root cause anyway. If not,
we might reintroduce this issue in the future.



From: Shaohua Li on
On Tue, Mar 30, 2010 at 02:40:07PM +0800, KOSAKI Motohiro wrote:
> > On Tue, 2010-03-30 at 14:08 +0800, KOSAKI Motohiro wrote:
> > > Hi
> > >
> > > > Commit 84b18490d1f1bc7ed5095c929f78bc002eb70f26 introduces a regression.
> > > > With it, our tmpfs test always oom. The test has a lot of rotated anon
> > > > pages and cause percent[0] zero. Actually the percent[0] is a very small
> > > > value, but our calculation round it to zero. The commit makes vmscan
> > > > completely skip anon pages and cause oops.
> > > > An option is if percent[x] is zero in get_scan_ratio(), forces it
> > > > to 1. See below patch.
> > > > But the offending commit still changes behavior. Without the commit, we scan
> > > > all pages if priority is zero, below patch doesn't fix this. Don't know if
> > > > It's required to fix this too.
> > >
> > > Can you please post your /proc/meminfo
> > attached.
> > > and reproduce program? I'll digg it.
> > our test is quite sample. mount tmpfs with double memory size and store several
> > copies (memory size * 2/G) of kernel in tmpfs, and then do kernel build.
> > for example, there is 3G memory and then tmpfs size is 6G and there is 6
> > kernel copy.
>
> Wow, tmpfs size > memsize!
>
>
> > > Very unfortunately, this patch isn't acceptable. In past time, vmscan
> > > had similar logic, but 1% swap-out made lots bug reports.
> > can you elaborate this?
> > Completely restore previous behavior (do full scan with priority 0) is
> > ok too.
>
> This is a option. but we need to know the root cause anyway.
I thought I mentioned the root cause in the first mail. My debugging shows
recent_rotated[0] is big, but recent_rotated[1] is almost zero, which makes
percent[0] 0. But you can double-check too.
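
The rounding described above can be reproduced with a small userspace sketch. It is a simplified model, not the kernel function: the arithmetic follows my reading of get_scan_ratio() from that era and should be treated as an assumption, and `struct reclaim_stat_model` and `scan_percent_model()` are hypothetical names. Index 0 is anon and index 1 is file, as in the thread.

```c
#include <assert.h>

/* Hypothetical stand-in for the per-zone reclaim statistics. */
struct reclaim_stat_model {
	unsigned long recent_scanned[2];	/* [0] anon, [1] file */
	unsigned long recent_rotated[2];
};

/*
 * Simplified model of the anon/file split. Each list's weight grows with
 * the pages scanned and shrinks with the pages rotated back onto the list.
 * percent[0] is the anon share in whole percent; integer division
 * truncates, so a very small share becomes exactly 0.
 */
static void scan_percent_model(const struct reclaim_stat_model *rs,
			       unsigned long swappiness,
			       unsigned long percent[2])
{
	unsigned long anon_prio, file_prio, ap, fp;

	assert(swappiness <= 200);
	anon_prio = swappiness;
	file_prio = 200 - swappiness;

	ap = (anon_prio + 1) * (rs->recent_scanned[0] + 1);
	ap /= rs->recent_rotated[0] + 1;

	fp = (file_prio + 1) * (rs->recent_scanned[1] + 1);
	fp /= rs->recent_rotated[1] + 1;

	percent[0] = 100 * ap / (ap + fp + 1);
	percent[1] = 100 - percent[0];
}
```

For example, with recent_scanned = {100000, 100000}, recent_rotated = {99000, 0}, and swappiness 60, ap is 61 and fp is 14100141, so percent[0] truncates to 0 and percent[1] is 100.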
From: KOSAKI Motohiro on
> > > > Very unfortunately, this patch isn't acceptable. In past time, vmscan
> > > > had similar logic, but 1% swap-out made lots bug reports.
> > > can you elaborate this?
> > > Completely restore previous behavior (do full scan with priority 0) is
> > > ok too.
> >
> > This is a option. but we need to know the root cause anyway.
> I thought I mentioned the root cause in first mail. My debug shows
> recent_rotated[0] is big, but recent_rotated[1] is almost zero, which makes
> percent[0] 0. But you can double check too.

Reverting would rescue the percent[0]==0 && priority==0 case, but I think
that case shouldn't happen in the first place. Reaching it implies a big latency issue.

Can you please try the following patch? Also, I'll prepare a reproduction environment soon.



---
 mm/vmscan.c |   12 ++++++++----
 1 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/mm/vmscan.c b/mm/vmscan.c
index 79c8098..abf7f79 100644
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -1571,15 +1571,19 @@ static void get_scan_ratio(struct zone *zone, struct scan_control *sc,
 	 */
 	if (unlikely(reclaim_stat->recent_scanned[0] > anon / 4)) {
 		spin_lock_irq(&zone->lru_lock);
-		reclaim_stat->recent_scanned[0] /= 2;
-		reclaim_stat->recent_rotated[0] /= 2;
+		while (reclaim_stat->recent_scanned[0] > anon / 4) {
+			reclaim_stat->recent_scanned[0] /= 2;
+			reclaim_stat->recent_rotated[0] /= 2;
+		}
 		spin_unlock_irq(&zone->lru_lock);
 	}
 
 	if (unlikely(reclaim_stat->recent_scanned[1] > file / 4)) {
 		spin_lock_irq(&zone->lru_lock);
-		reclaim_stat->recent_scanned[1] /= 2;
-		reclaim_stat->recent_rotated[1] /= 2;
+		while (reclaim_stat->recent_scanned[1] > file / 4) {
+			reclaim_stat->recent_scanned[1] /= 2;
+			reclaim_stat->recent_rotated[1] /= 2;
+		}
 		spin_unlock_irq(&zone->lru_lock);
 	}

--
1.6.5.2
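
The effect of the patch above can be seen in a small userspace sketch that compares the removed single halving with the added loop. This is an illustration only: the helper names are hypothetical, the counters are passed as plain pointers, and the zone->lru_lock locking from the patch is left out.

```c
#include <assert.h>

/* Aging as in the removed lines: halve both counters at most once. */
static void age_once(unsigned long *scanned, unsigned long *rotated,
		     unsigned long lru_pages)
{
	if (*scanned > lru_pages / 4) {
		*scanned /= 2;
		*rotated /= 2;
	}
}

/* Aging as in the added lines: keep halving until scanned <= lru_pages / 4. */
static void age_until_bounded(unsigned long *scanned, unsigned long *rotated,
			      unsigned long lru_pages)
{
	while (*scanned > lru_pages / 4) {
		*scanned /= 2;
		*rotated /= 2;
	}
	/* Holds on exit, including when lru_pages / 4 is 0. */
	assert(*scanned <= lru_pages / 4);
}
```

Starting from scanned = 100000 and rotated = 99000 on a 4000-page list, one halving leaves 50000 and 49500, still far above the 1000-page threshold. The loop halves seven times and ends at 781 and 773. Both counters shrink by the same factor, so the loop bounds the history size rather than changing the rotated/scanned ratio by itself.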




