From: Bill Davidsen on
Andreas Mohr wrote:
> On Thu, Apr 08, 2010 at 04:12:41PM -0400, Bill Davidsen wrote:
>
>> Andreas Mohr wrote:
>>
>>> Clearly there's a very, very important limiter somewhere in bio layer
>>> missing or broken, a 300M dd /dev/zero should never manage to put
>>> such an onerous penalty on a system, IMHO.
>>>
>>>
>> You are using a USB 1.1 connection, about the same speed as a floppy. If
>>
>
> Ahahahaaa. A rather distant approximation given a speed of 20kB/s vs. 987kB/s ;)
> (but I get the point you're making here)
>
> I'm not at all convinced that USB2.0 would fare any better here, though:
> after all we are buffering the file that is written to the device
> - after the fact!
> (plus there are many existing complaints from people that copying large files
> manages to break entire machines, and I doubt many of those were using
> USB1.1)
> https://bugzilla.kernel.org/show_bug.cgi?id=13347
> https://bugzilla.kernel.org/show_bug.cgi?id=7372
> And many other reports.
>
>
>> you have not tuned your system to prevent all of the memory from being
>> used to cache writes, it will be used that way. I don't have my notes
>> handy, but I believe you need to tune the "dirty" parameters of
>> /proc/sys/vm so that it makes better use of memory.
>>
>
> Hmmmm. I don't believe that there should be much in need of being
> tuned, especially in light of default settings being so problematic.
> Of course things here are similar to the shell ulimit philosophy,
> but IMHO default behaviour should be reasonable.
>
>
>> Of course putting a fast device like SSD on a super slow connection makes
>> no sense other than as a test of system behavior on misconfigured
>> machines.
>>
>
> "because I can" (tm) :)
>
> And because I like to break systems that happen to work moderately wonderfully
> for the mainstream(?)(?!?) case of quad cores with 16GB of RAM ;)
> [well in fact I don't, but of course that just happens to happen...]
>

I will tell you one more thing you can do to test my thought that you
are completely filling memory: copy the data to the device with O_DIRECT,
so it never dirties the page cache. It will slow the copy slightly, but
it keeps the system responsive. I used to have a USB 2.0 disk and you are
right, it shows the same problems; that's why I have some ideas about tuning.
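
Something along these lines, for example (the mount point and byte values
are only placeholders, adjust them for your setup):

  # write through O_DIRECT so the data never dirties the page cache
  dd if=/dev/zero of=/mnt/usb/testfile bs=1M count=300 oflag=direct

  # and/or cap how much dirty data may pile up before writeback starts
  echo $((8 * 1024 * 1024))  > /proc/sys/vm/dirty_background_bytes
  echo $((16 * 1024 * 1024)) > /proc/sys/vm/dirty_bytes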

And during the 2.5 development phase I played with "per fd" limits on
dirty memory per file, which solved the problem for me. I had some
educational discussions with several developers, but this is one of those
things with limited general usefulness, and development at the time was
busy with things deemed more important, so I never tried to get it ready
for inclusion in the kernel.

--
Bill Davidsen <davidsen(a)tmr.com>
"We can't solve today's problems by using the same thinking we
used in creating them." - Einstein

From: Ben Gamari on
On Thu, 8 Apr 2010 22:35:26 +0200, Andreas Mohr <andi(a)lisas.de> wrote:
> On Thu, Apr 08, 2010 at 04:12:41PM -0400, Bill Davidsen wrote:
> > Andreas Mohr wrote:
> >> Clearly there's a very, very important limiter somewhere in bio layer
> >> missing or broken, a 300M dd /dev/zero should never manage to put
> >> such an onerous penalty on a system, IMHO.
> >>
> > You are using a USB 1.1 connection, about the same speed as a floppy. If
>
> Ahahahaaa. A rather distant approximation given a speed of 20kB/s vs. 987kB/s ;)
> (but I get the point you're making here)
>
> I'm not at all convinced that USB2.0 would fare any better here, though:
> after all we are buffering the file that is written to the device
> - after the fact!
> (plus there are many existing complaints from people that copying large files
> manages to break entire machines, and I doubt many of those were using
> USB1.1)
> https://bugzilla.kernel.org/show_bug.cgi?id=13347
> https://bugzilla.kernel.org/show_bug.cgi?id=7372
> And many other reports.

Indeed. I have found this to be a persistent problem and I really wish there
were more interest in debugging it. I have tried bringing the community's
resources to bear on this issue several[1] times, and each time we have either
failed to get enough of the right eyes looking at it or developer interest has
simply vanished.

I've started putting together a list[2] of pertinent threads/patches/bugs/data
in hopes that it will lower the energy barrier to getting up to speed on
this issue. Hopefully it helps.

Cheers,

- Ben


[1] https://bugzilla.kernel.org/show_bug.cgi?id=12309
[2] http://goldnerlab.physics.umass.edu/wiki/BenGamari/IoWaitLatency

From: KOSAKI Motohiro on
> > Many applications (this one and below) are stuck in
> > wait_on_page_writeback(). I guess this is why "heavy write to
> > irrelevant partition stalls the whole system". They are stuck on page
> > allocation. Your 512MB system memory is a bit tight, so reclaim
> > pressure is a bit high, which triggers the wait-on-writeback logic.
>
> I wonder if this hacking patch may help.
>
> When creating 300MB dirty file with dd, it is creating continuous
> region of hard-to-reclaim pages in the LRU list. priority can easily
> go low when irrelevant applications' direct reclaim run into these
> regions..

Sorry, I'm confused now. Can you give a more detailed explanation?
Why would lumpy reclaim cause OOM? Lumpy reclaim might slow direct
reclaim down, but IIUC it doesn't cause OOM, because OOM only occurs
when priority-0 reclaim fails. IO getting stuck also prevents the
priority from reaching 0.



>
> Thanks,
> Fengguang
> ---
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index e0e5f15..f7179cf 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -1149,7 +1149,7 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
> */
> if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
> lumpy_reclaim = 1;
> - else if (sc->order && priority < DEF_PRIORITY - 2)
> + else if (sc->order && priority < DEF_PRIORITY / 2)
> lumpy_reclaim = 1;
>
> pagevec_init(&pvec, 1);



From: Wu Fengguang on
On Thu, Apr 15, 2010 at 11:31:52AM +0800, KOSAKI Motohiro wrote:
> > > Many applications (this one and below) are stuck in
> > > wait_on_page_writeback(). I guess this is why "heavy write to
> > > irrelevant partition stalls the whole system". They are stuck on page
> > > allocation. Your 512MB system memory is a bit tight, so reclaim
> > > pressure is a bit high, which triggers the wait-on-writeback logic.
> >
> > I wonder if this hacking patch may help.
> >
> > When creating 300MB dirty file with dd, it is creating continuous
> > region of hard-to-reclaim pages in the LRU list. priority can easily
> > go low when irrelevant applications' direct reclaim run into these
> > regions..
>
> Sorry, I'm confused now. Can you give a more detailed explanation?
> Why would lumpy reclaim cause OOM? Lumpy reclaim might slow direct
> reclaim down, but IIUC it doesn't cause OOM, because OOM only occurs
> when priority-0 reclaim fails.

No, I'm not talking about OOM, nor about lumpy reclaim itself.

I mean that direct reclaim can get stuck for a long time when we do
wait_on_page_writeback() with lumpy_reclaim=1.

> IO getting stuck also prevents the priority from reaching 0.

Sure. But we can wait for IO a bit later -- after scanning 1/64 of the
LRU (with the patch quoted below) instead of the current 1/1024.

In Andreas' case, 512MB/1024 = 512KB, which is way too low compared to
the 22MB of writeback pages. There can easily be a continuous range of
512KB of dirty/writeback pages in the LRU, which will trigger the wait
logic.
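
To spell the numbers out (DEF_PRIORITY is 12, and I'm loosely using the
512MB of RAM as a stand-in for the LRU size):

  priority < DEF_PRIORITY - 2  ->  waits start around priority 10,
                                   i.e. a scan window of 512MB/1024 = 512KB
  priority < DEF_PRIORITY / 2  ->  waits start around priority 6,
                                   i.e. a scan window of 512MB/64   = 8MB

8MB is at least of the same order as the 22MB of writeback pages, so a
short run of dirty/writeback pages is far less likely to stall an
innocent task.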

Thanks,
Fengguang

>
>
> >
> > Thanks,
> > Fengguang
> > ---
> >
> > diff --git a/mm/vmscan.c b/mm/vmscan.c
> > index e0e5f15..f7179cf 100644
> > --- a/mm/vmscan.c
> > +++ b/mm/vmscan.c
> > @@ -1149,7 +1149,7 @@ static unsigned long shrink_inactive_list(unsigned long max_scan,
> > */
> > if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
> > lumpy_reclaim = 1;
> > - else if (sc->order && priority < DEF_PRIORITY - 2)
> > + else if (sc->order && priority < DEF_PRIORITY / 2)
> > lumpy_reclaim = 1;
> >
> > pagevec_init(&pvec, 1);
>
>
From: KOSAKI Motohiro on
> On Thu, Apr 15, 2010 at 11:31:52AM +0800, KOSAKI Motohiro wrote:
> > > > Many applications (this one and below) are stuck in
> > > > wait_on_page_writeback(). I guess this is why "heavy write to
> > > > irrelevant partition stalls the whole system". They are stuck on page
> > > > allocation. Your 512MB system memory is a bit tight, so reclaim
> > > > pressure is a bit high, which triggers the wait-on-writeback logic.
> > >
> > > I wonder if this hacking patch may help.
> > >
> > > When creating 300MB dirty file with dd, it is creating continuous
> > > region of hard-to-reclaim pages in the LRU list. priority can easily
> > > go low when irrelevant applications' direct reclaim run into these
> > > regions..
> >
> > Sorry, I'm confused now. Can you give a more detailed explanation?
> > Why would lumpy reclaim cause OOM? Lumpy reclaim might slow direct
> > reclaim down, but IIUC it doesn't cause OOM, because OOM only occurs
> > when priority-0 reclaim fails.
>
> No, I'm not talking about OOM, nor about lumpy reclaim itself.
>
> I mean that direct reclaim can get stuck for a long time when we do
> wait_on_page_writeback() with lumpy_reclaim=1.
>
> > IO getting stuck also prevents the priority from reaching 0.
>
> Sure. But we can wait for IO a bit later -- after scanning 1/64 of the
> LRU (with the patch quoted below) instead of the current 1/1024.
>
> In Andreas' case, 512MB/1024 = 512KB, which is way too low compared to
> the 22MB of writeback pages. There can easily be a continuous range of
> 512KB of dirty/writeback pages in the LRU, which will trigger the wait
> logic.

My feeling from your explanation is that we need an auto-adjustment
mechanism rather than changing the default value for one particular
class of machine, no?
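
For example, something vaguely like this (completely untested, just to
illustrate the direction; the ~8MB figure is made up, and I'm assuming
zone_reclaimable_pages() is usable at this point):

	/* illustration only: arm the lumpy wait once the per-pass scan
	 * window covers some absolute amount of memory, instead of using
	 * a fixed priority cutoff that only fits one machine size */
	unsigned long lumpy_min_scan = (8 << 20) >> PAGE_SHIFT;

	if (sc->order > PAGE_ALLOC_COSTLY_ORDER)
		lumpy_reclaim = 1;
	else if (sc->order &&
		 (zone_reclaimable_pages(zone) >> priority) >= lumpy_min_scan)
		lumpy_reclaim = 1;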


