oom-kill: add lowmem usage aware oom kill handling [Kernel]


From: Alan Cox on 27 Jan 2010 19:20

> Now, /proc/<pid>/oom_score and /proc/<pid>/oom_adj are used by servers.

And embedded, and some desktops (including some neat experimental hacks
where windows slowly get to be bigger and bigger oom targets the longer
they've been non-focussed)
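
A minimal sketch of how such a desktop hack could work, assuming the
bit-shift style oom_adj scaling used by kernels of this era (range -17
to +15, with -17 exempting the task). The policy function, its name, and
the 10-minute step are hypothetical illustrations, not taken from any
real window manager; a real tool would write the resulting value to
/proc/<pid>/oom_adj.

```python
OOM_ADJ_MAX = 15    # largest oom_adj value accepted (assumption: 2.6.3x-era kernel)
OOM_DISABLE = -17   # oom_adj value that exempts a task from the OOM killer


def oom_adj_for_unfocused(seconds_unfocused, step_seconds=600):
    """Hypothetical policy: +1 oom_adj per step_seconds without focus, capped."""
    if seconds_unfocused < 0:
        raise ValueError("seconds_unfocused must be non-negative")
    return min(OOM_ADJ_MAX, int(seconds_unfocused // step_seconds))


def adjusted_badness(points, oom_adj):
    """Scale a raw badness score by shifting it left or right by oom_adj."""
    if oom_adj == OOM_DISABLE:
        return 0                          # task is never selected
    if oom_adj > 0:
        return max(points, 1) << oom_adj  # each step doubles the score
    return points >> -oom_adj             # each negative step halves it


if __name__ == "__main__":
    for idle in (0, 1800, 7200):
        adj = oom_adj_for_unfocused(idle)
        print(idle, adj, adjusted_badness(100, adj))
```

Because each oom_adj step doubles the score, a window left unfocused for
a few hours quickly outranks everything else under this policy.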

> For my customers, I don't like oom black magic. I'd like to recommend to
> use memcg, of course ;) But lowmem oom cannot be handled by memcg, well.
> So I started from this.

I can't help feeling this is the wrong approach. IFF we are running out
of low memory pages then killing stuff for that reason is wrong to begin
with except in extreme cases and those extreme cases are probably also
cases the kill won't help.

If we have a movable user page (even an mlocked one) then if there is
space in other parts of memory (ie the OOM is due to a single zone
problem) we should *never* be killing in the first place, we should be
moving the page. The mlock case is a bit hairy but the non mlock case is
exactly the same sequence of operations as a page out and page in
somewhere else skipping the widdling on the disk bit in the middle.

There are cases we can't do that - eg if the kernel has it pinned for
DMA, but in that case OOM isn't going to recover the page either - at
least not until the DMA or whatever unpins it (at which point you could
just move it).

Am I missing something fundamental here ?

Alan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: KAMEZAWA Hiroyuki on 27 Jan 2010 19:30

Thank you for the comment. But I have stopped this already....

On Thu, 28 Jan 2010 00:16:36 +0000
Alan Cox <alan(a)lxorguk.ukuu.org.uk> wrote:

> > Now, /proc/<pid>/oom_score and /proc/<pid>/oom_adj are used by servers.
>
> And embedded, and some desktops (including some neat experimental hacks
> where windows slowly get to be bigger and bigger oom targets the longer
> they've been non-focussed)
>
Sure.

> > For my customers, I don't like oom black magic. I'd like to recommend to
> > use memcg, of course ;) But lowmem oom cannot be handled by memcg, well.
> > So I started from this.
>
> I can't help feeling this is the wrong approach. IFF we are running out
> of low memory pages then killing stuff for that reason is wrong to begin
> with except in extreme cases and those extreme cases are probably also
> cases the kill won't help.
>
> If we have a movable user page (even an mlocked one) then if there is
> space in other parts of memory (ie the OOM is due to a single zone
> problem) we should *never* be killing in the first place, we should be
> moving the page. The mlock case is a bit hairy but the non mlock case is
> exactly the same sequence of operations as a page out and page in
> somewhere else skipping the widdling on the disk bit in the middle.
>
> There are cases we can't do that - eg if the kernel has it pinned for
> DMA, but in that case OOM isn't going to recover the page either - at
> least not until the DMA or whatever unpins it (at which point you could
> just move it).
>
> Am I missing something fundamental here ?
>

I just wanted to make sure the oom-killer doesn't kill sshd, the X server,
or a task launcher. IOW, the oom-killer shouldn't make an unreasonable
selection.

If a lowmem user is killed, I'll be satisfied with the case "Oh, the process
was killed because lowmem was short and it used lowmem, Hmmm..." and
never be satisfied with the case "Ouch! F*cking OOM killer killed the X server
and tens of innocent processes!!!".

But yeah, I'll stop this. For me, panic_on_oom=1 is enough.

Thanks,
-Kame


From: David Rientjes on 27 Jan 2010 20:10

On Thu, 28 Jan 2010, Alan Cox wrote:

> > Now, /proc/<pid>/oom_score and /proc/<pid>/oom_adj are used by servers.
>
> And embedded, and some desktops (including some neat experimental hacks
> where windows slowly get to be bigger and bigger oom targets the longer
> they've been non-focussed)
>

Right, oom_adj is used much more widely than described.

> I can't help feeling this is the wrong approach. IFF we are running out
> of low memory pages then killing stuff for that reason is wrong to begin
> with except in extreme cases and those extreme cases are probably also
> cases the kill won't help.
>
> If we have a movable user page (even an mlocked one) then if there is
> space in other parts of memory (ie the OOM is due to a single zone
> problem) we should *never* be killing in the first place, we should be
> moving the page. The mlock case is a bit hairy but the non mlock case is
> exactly the same sequence of operations as a page out and page in
> somewhere else skipping the widdling on the disk bit in the middle.
>

Mel Gorman's memory compaction patchset will preempt direct reclaim and
the oom killer if it can defragment zones by page migration such that a
higher order allocation would now succeed.

In this specific context, both compaction and direct reclaim will have
failed so the oom killer is the only alternative. For __GFP_NOFAIL,
that's required. However, there has been some long-standing debate (and
not only for lowmem, but for all oom conditions) about when the page
allocator should simply return NULL. We've always killed something on
blocking allocations to favor current at the expense of other memory hogs,
but that may be changed soon: it may make sense to defer oom killing
completely unless the badness() score reaches a certain threshold such
that memory leakers really can be dealt with accordingly.

In the lowmem case, it certainly seems plausible to use the same behavior
that we currently do for mempolicy-constrained ooms: kill current.
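
The policy discussed above could be sketched roughly as follows. This is
an illustration only: the Task type, the pick_victim() name, and the
threshold parameter are all hypothetical, and the real kernel logic
(including the __GFP_NOFAIL case) is considerably more involved.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Task:
    name: str
    badness: int   # stand-in for the kernel's badness() score


def pick_victim(tasks: List[Task], current: Task,
                constrained: bool, threshold: int) -> Optional[Task]:
    """Return the task to kill, or None to fail the allocation instead."""
    if constrained:
        # lowmem or mempolicy-constrained OOM: kill the allocating task
        return current
    worst = max(tasks, key=lambda t: t.badness, default=None)
    if worst is None or worst.badness < threshold:
        # nobody looks like a memory hog: defer killing, return NULL
        return None
    return worst


if __name__ == "__main__":
    sshd = Task("sshd", 40)
    leaker = Task("leaker", 9000)
    print(pick_victim([sshd, leaker], sshd, False, 1000))
```

With a threshold in place, a system whose tasks all score low would see
the allocation fail rather than an arbitrary process being killed.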

From: Vedran Furač on 28 Jan 2010 19:30

Alan Cox wrote:

> Am I missing something fundamental here ?

Yes, the fact that linux mm currently sucks. How else would you explain
the possibility of killing random (often root-owned) processes using a
5-line program started by an ordinary user? The killed process could be an
apache web server or the X server on a desktop. I demonstrated this flaw a
few months ago here, and only Kame tried to find a way to fix it, but he
encountered noncooperation.

I don't know what to say, really. Sad... Actually funny, when you know
that a competing OS, often ridiculed by linux users, doesn't suffer any
consequences when that same 5-line program is run.

Regards,
Vedran

--
http://vedranf.net | a8e7a7783ca0d460fee090cc584adc12

From: Alan Cox on 28 Jan 2010 19:40

On Fri, 29 Jan 2010 01:25:18 +0100
Vedran Furač <vedran.furac(a)gmail.com> wrote:

> Alan Cox wrote:
>
> > Am I missing something fundamental here ?
>
> Yes, the fact that linux mm currently sucks. How else would you explain
> the possibility of killing random (often root-owned) processes using a
> 5-line program started by an ordinary user?

If you don't want to run with overcommit you turn it off. At that point
processes get memory allocations refused if they can overrun the
theoretical limit, but you generally need more swap (it's one of the
reasons why things like BSD historically have a '3 * memory' rule).
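
As a rough illustration of that theoretical limit: with strict
accounting (vm.overcommit_memory=2, a setting not named in the text
above), the commit limit is, as far as I understand it, swap plus a
percentage of RAM given by vm.overcommit_ratio (default 50). The
function names below are hypothetical, and the sketch ignores the small
reserve the kernel keeps for root.

```python
def commit_limit_kb(ram_kb, swap_kb, overcommit_ratio=50, hugetlb_kb=0):
    """Approximate CommitLimit under strict overcommit accounting."""
    return (ram_kb - hugetlb_kb) * overcommit_ratio // 100 + swap_kb


def would_refuse(committed_kb, request_kb, limit_kb):
    """True if a new allocation would push commitments past the limit."""
    return committed_kb + request_kb > limit_kb


if __name__ == "__main__":
    ram, swap = 4_000_000, 2_000_000      # illustrative sizes in kB
    limit = commit_limit_kb(ram, swap)
    print("limit:", limit)
    print("refused:", would_refuse(3_900_000, 200_000, limit))
```

With the default ratio only half of RAM counts toward the limit, which
is why strict accounting generally needs more swap to be usable.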

So sounds to me like a problem between the keyboard and screen (coupled
with the fact far too few desktop vendors include tools to easily set
this stuff up)

Alan
