From: Mel Gorman on
On Mon, Sep 21, 2009 at 12:46:34PM +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > > <SNIP>
> > > > >
> > > > > This time it is an order-6 page allocation failure for rt2870sta
> > > > > (w/ upcoming driver changes) and Linus' tree from few days ago..
> > > > >
> > > >
> > > > It's another high-order atomic allocation which is difficult to grant.
> > > > I didn't look closely, but is this the same type of thing - large allocation
> > > > failure during firmware loading? If so, is this during resume or is the
> > > > device being reloaded for some other reason?
> > >
> > > Just modprobing the driver on a system running for some time.
> > >
> >
> > Was this a common situation before?
>
> Yes, just like firmware restarts with ipw2200.
>
> > > > I suspect that there are going to be a few of these bugs cropping up
> > > > every so often where network devices are assuming large atomic
> > > > allocations will succeed because the "only time they happen" is during
> > > > boot but these days are happening at runtime for other reasons.
> > >
> > > I wouldn't go so far as calling a normal order-6 (256kB) allocation on
> > > 512MB machine with 1024MB swap a bug. Moreover such failures just never
> > > happened before 2.6.31-rc1.
> >
> > It's not that normal, it's an allocation that cannot sleep and cannot
> > reclaim. Why is something like firmware loading allocating memory like
>
> OK.
>
> > that? Is this use of GFP_ATOMIC relatively recent or has it always been
> > that way?
>
> It has always been like that.
>

Nuts, why is firmware loading depending on GFP_ATOMIC?

> > > I don't know why people don't see it but for me it has a memory management
> > > regression and reliability issue written all over it.
> > >
> >
> > Possibly but drivers that reload their firmware as a response to an
> > error condition is relatively new and loading network drivers while the
> > system is already up and running a long time does not strike me as
> > typical system behaviour.
>
> Loading drivers after boot is a typical desktop/laptop behavior, please
> think about hotplug (the hardware in question is an USB dongle).
>

In that case, how reproducible is this problem, so that it can be
bisected? Basically, there are no guarantees that GFP_ATOMIC allocations
of this order will succeed, although you can improve the odds by increasing
min_free_kbytes. Network drivers should never have been depending on
GFP_ATOMIC succeeding like this, but the hole has been dug now.
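
For reference, the watermark can be inspected and raised via sysctl. This
is a tuning aid only, not a guarantee, and the value below is purely
illustrative:

```shell
# Current reserve, in kB, that kswapd tries to keep free
cat /proc/sys/vm/min_free_kbytes

# Raising it (needs root) makes kswapd keep more memory free, which
# improves -- but does not guarantee -- the odds for high-order
# GFP_ATOMIC attempts.  The value is illustrative, tune per machine:
#   sysctl -w vm.min_free_kbytes=65536
```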

If it's happening more frequently now than it used to then either

1. The allocations are occurring more frequently, whereas previously a
pool might have been reused or the memory not freed for the lifetime of
the system.

2. Something has changed in the allocator. I'm not aware of recent
changes that could cause this though in such a recent time-frame.

3. Something has changed recently with respect to reclaim. There have
been changes made recently to lumpy reclaim and that might be impacting
kswapd's efforts at keeping large contiguous regions free.

4. Hotplug events that involve driver loads are more common now than they
were previously for some reason. You mention that this is a USB dongle for
example. Was it a case before that the driver loaded early and remained
resident but only active after a hotplug event? If that was the case,
the memory would be allocated once at boot. However, if an optimisation
made recently unloads those unused drivers and re-loads them later, there
would be more order-6 allocations than there were previously, manifesting
as these bug reports. Is this a possibility?

The ideal would be that network drivers not make allocations like this
in the first place by, for example, DMAing the firmware across in
page-size chunks instead of one contiguous lump :/
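
As a rough illustration of the page-size-chunk approach (this is not the
actual rt2870sta code; send_chunk() is a hypothetical stand-in for the
driver's real per-transfer primitive, e.g. a USB bulk write):

```c
/* Sketch: upload firmware in page-sized chunks rather than allocating
 * one large physically contiguous buffer with GFP_ATOMIC.  Each
 * transfer then only needs an order-0 (single page) buffer at most. */
#include <stddef.h>

#define FW_CHUNK_SIZE 4096  /* one page */

/* Hypothetical per-device transfer primitive, supplied elsewhere. */
int send_chunk(const unsigned char *buf, size_t len);

int load_firmware_chunked(const unsigned char *fw, size_t fw_len)
{
	size_t off;

	for (off = 0; off < fw_len; off += FW_CHUNK_SIZE) {
		size_t len = fw_len - off;

		if (len > FW_CHUNK_SIZE)
			len = FW_CHUNK_SIZE;
		if (send_chunk(fw + off, len) != 0)
			return -1;  /* abort on transfer error */
	}
	return 0;
}
```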

--
Mel Gorman
Part-time PhD Student, University of Limerick
Linux Technology Center, IBM Dublin Software Lab
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Bartlomiej Zolnierkiewicz on
On Monday 21 September 2009 12:56:48 Pekka Enberg wrote:
> On Mon, 2009-09-21 at 12:46 +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > I don't know why people don't see it but for me it has a memory management
> > > > regression and reliability issue written all over it.
> > >
> > > Possibly but drivers that reload their firmware as a response to an
> > > error condition is relatively new and loading network drivers while the
> > > system is already up and running a long time does not strike me as
> > > typical system behaviour.
> >
> > Loading drivers after boot is a typical desktop/laptop behavior, please
> > think about hotplug (the hardware in question is an USB dongle).
>
> Yeah, I wonder what broke things. Did the wireless stack change in
> 2.6.31-rc1 too? IIRC Mel ruled out page allocator changes as a suspect.

The thing is that the mm behavior change was already narrowed down, over
a month ago, to the -mm merge in 2.6.31-rc1 (as noted in my initial
reports). I first thought that it was -next breakage but it turned out
that it came the other way around (because -mm is not even pulled into -next
currently -- great way to set an example for other kernel maintainers BTW).

I understand that behavior change may be justified and technically correct
in itself. I also completely agree that high order allocations in certain
drivers need fixing anyway.

However there is something wrong with the big picture and the way changes
are happening. I'm not saying that I'm surprised though, especially given
the recent decline in quality assurance and the paradigm shift that
I'm seeing (some influential top-level people saying that -rc1 is now fine
for testing new code, or the "new kernel, new hardware" thing).

Sorry but I have no more time currently to narrow the issue down further
(guess what, there are other kernel bugs standing in the way of bisecting
it, and I would have to provide some reliable way to reproduce it first),
so I see no more point in wasting people's time on this. I can certainly
get by with an allocation failure here and there. Not a big deal for me
personally.
From: Mel Gorman on
On Mon, Sep 21, 2009 at 03:12:14PM +0200, Bartlomiej Zolnierkiewicz wrote:
> On Monday 21 September 2009 12:56:48 Pekka Enberg wrote:
> > On Mon, 2009-09-21 at 12:46 +0200, Bartlomiej Zolnierkiewicz wrote:
> > > > > I don't know why people don't see it but for me it has a memory management
> > > > > regression and reliability issue written all over it.
> > > >
> > > > Possibly but drivers that reload their firmware as a response to an
> > > > error condition is relatively new and loading network drivers while the
> > > > system is already up and running a long time does not strike me as
> > > > typical system behaviour.
> > >
> > > Loading drivers after boot is a typical desktop/laptop behavior, please
> > > think about hotplug (the hardware in question is an USB dongle).
> >
> > Yeah, I wonder what broke things. Did the wireless stack change in
> > 2.6.31-rc1 too? IIRC Mel ruled out page allocator changes as a suspect.
>
> The thing is that the mm behavior change has been narrowed down already
> over a month ago to -mm merge in 2.6.31-rc1 (as has been noted in my initial
> reports), I first thought that it was -next breakage but it turned out
> that it came the other way around (because -mm is not even pulled into -next
> currently -- great way to set an example for other kernel maintainers BTW).
>

Is there a reliable reproduction case for this that narrowed it down to
2.6.31-rc1? That is the window where a number of page-allocator optimisation
patches made it in. None of them should have affected the allocator from a
fragmentation perspective though.

If you have a reliable reproduction case, testing between commits
d239171e4f6efd58d7e423853056b1b6a74f1446..a1dd268cf6306565a31a48deff8bf4f6b4b105f7
would be nice, particularly if it can be bisected within that small
window rather than a full bisect of an rc1 which I know can be a major
mess.
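
If a reproduction does materialise, that range can be handed straight to
git bisect. These commands assume a kernel tree that contains both
commits, and each step requires building, booting and retesting, so no
automated check is possible here:

```shell
# Restrict the bisect to the page-allocator series: newest commit is
# the known-bad end, oldest is the known-good end.
git bisect start a1dd268cf6306565a31a48deff8bf4f6b4b105f7 \
                 d239171e4f6efd58d7e423853056b1b6a74f1446

# For each step: build, boot, try to reproduce the order-6 failure,
# then mark the result:
#   git bisect bad     # failure reproduced
#   git bisect good    # failure not seen
```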

> I understand that behavior change may be justified and technically correct
> in itself. I also completely agree that high order allocations in certain
> drivers need fixing anyway.
>
> However there is something wrong with the big picture and the way changes
> are happening. I'm not saying that I'm surprised though, especially given
> the recent decline in the quality assurance and the paradigm shift that
> I'm seeing (some influential top level people talking that -rc1 is fine for
> testing new code now or the "new kernel new hardware" thing).
>

The quality assurance comment is a bit unfair with respect to the page
allocator. There are a lot of things that can have changed that would hose
order-6 atomic allocations. Furthermore, test cases used for mm patches
would not have treated such allocations as critical. Even
if it had been considered, it would have been dismissed as "it makes no
sense for drivers to be doing order-6 GFP_ATOMIC allocations".

> Sorry but I have no more time currently to narrow down the issue some more
> (guess what, there are other kernel bugs standing in the way to bisect it
> and I would have to provide some reliable way to reproduce it first) so I
> see no more point in wasting people's time on this. I can certainly get by
> with allocation failure here and there. Not a big deal for me personally..
>

That is somewhat unfortunate. Even testing within the window above would
be very helpful, if you get the chance.

--
Mel Gorman
Part-time PhD Student, University of Limerick
Linux Technology Center, IBM Dublin Software Lab