From: David Rientjes on
On Wed, 28 Oct 2009, Vedran Furac wrote:

> > Those are practically happening simultaneously with very little memory
> > being available between each oom kill. Only later is "test" killed:
> >
> > [97240.203228] Out of memory: kill process 5005 (test) score 256912 or a child
> > [97240.206832] Killed process 5005 (test)
> >
> > Notice how the badness score is less than 1/4th of the others. So while
> > you may find it to be hogging a lot of memory, there were others that
> > consumed much more.
> ^^^^^^^^^^^^^^^^^^^^^
>
> This is just wrong. I have 3.5GB of RAM, free says that 2GB are empty
> (ignoring cache). Culprit then allocates all free memory (2GB). That
> means it is using *more* than all other processes *together*. There
> cannot be any other "that consumed much more".
>

Just post the oom killer results after using echo 1 >
/proc/sys/vm/oom_dump_tasks as requested and it will clarify why those
tasks were chosen to kill. It will also show the result of using rss
instead of total_vm and allow us to see how such a change would have
changed the killing order for your workload.

> Thanks, I'll try that... but I guess that using rss would yield better
> results.
>

We would know if you posted the data.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Vedran Furač on
David Rientjes wrote:

> We would know if you posted the data.

I need to find some free time to destroy a session on a computer which I
use for work. You could easily test it yourself also as this doesn't
happen only to me.

Anyways, here it is... this time it started with ntpd:

http://pastebin.com/f3f9674a0

Regards,

Vedran
From: David Rientjes on
On Thu, 29 Oct 2009, Vedran Furac wrote:

> > We would know if you posted the data.
>
> I need to find some free time to destroy a session on a computer which I
> use for work. You could easily test it yourself also as this doesn't
> happen only to me.
>
> Anyways, here it is... this time it started with ntpd:
>
> http://pastebin.com/f3f9674a0
>

That oom log shows 12 ooms but no tasks actually appear to be getting
killed (there're no "Killed process 1234 (task)" found). Do you have any
idea why?

Anyway, as I posted in response to KAMEZAWA-san's patch, the change to
get_mm_rss(mm) prefers Xorg more than the current implementation.

From your log at the link above:

total_vm
669624 test
195695 krunner
187342 krusader
168881 plasma-desktop
130562 ktorrent
127081 knotify4
125881 icedove-bin
123036 akregator

rss
668738 test
42191 Xorg
30761 firefox-bin
13331 icedove-bin
10234 ktorrent
9263 akregator
8864 plasma-desktop
7532 krunner

Can you explain why Xorg is preferred as a baseline to kill rather than
krunner in your example?

Thanks.
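
The difference between the two lists above can be reproduced with a short sketch. It only sorts the raw total_vm and rss columns copied from the log; the kernel's actual badness score takes other factors into account, so this is an illustration of the ordering question, not a reimplementation of the oom killer. The helper names (ranking, first_after) are hypothetical.

```python
# Figures copied from the total_vm and rss lists quoted in this thread.
total_vm = {
    "test": 669624,
    "krunner": 195695,
    "krusader": 187342,
    "plasma-desktop": 168881,
    "ktorrent": 130562,
    "knotify4": 127081,
    "icedove-bin": 125881,
    "akregator": 123036,
}

rss = {
    "test": 668738,
    "Xorg": 42191,
    "firefox-bin": 30761,
    "icedove-bin": 13331,
    "ktorrent": 10234,
    "akregator": 9263,
    "plasma-desktop": 8864,
    "krunner": 7532,
}


def ranking(metric):
    """Return task names sorted from the largest value to the smallest."""
    return sorted(metric, key=metric.get, reverse=True)


def first_after(metric, hog="test"):
    """Return the highest-ranked task once the memory hog itself is excluded."""
    return next(name for name in ranking(metric) if name != hog)


if __name__ == "__main__":
    print("by total_vm:", ranking(total_vm))
    print("by rss:     ", ranking(rss))
    print("next after test (total_vm):", first_after(total_vm))
    print("next after test (rss):     ", first_after(rss))
```

With these figures, "test" ranks first under both metrics; the task ranked directly behind it is krunner by total_vm and Xorg by rss, which is the distinction being asked about.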
From: David Rientjes on
On Wed, 28 Oct 2009, KAMEZAWA Hiroyuki wrote:

> It's _not_ special to X.
>
> Almost all applications which use many dynamic libraries can be affected by
> this total_vm issue. And, as I explained to Vedran, a multi-threaded program
> like Java can easily increase total_vm without using much anon_rss.
> And that's the reason I hate overcommit_memory. The size of the VM doesn't
> tell anything.
>

Right, because Vedran's latest oom log shows that, with your patch, Xorg
is preferred over every other thread except the memory-hogging test
program, which is not the case without the patch. I pointed out a clear
distinction in the killing order between total_vm and rss in that log, and
in my opinion killing Xorg as opposed to krunner would be undesirable.
From: Vedran Furač on
David Rientjes wrote:

> On Thu, 29 Oct 2009, Vedran Furac wrote:
>
>>> We would know if you posted the data.
>> I need to find some free time to destroy a session on a computer which I
>> use for work. You could easily test it yourself also as this doesn't
>> happen only to me.
>>
>> Anyways, here it is... this time it started with ntpd:
>>
>> http://pastebin.com/f3f9674a0
>>
>
> That oom log shows 12 ooms but no tasks actually appear to be getting
> killed (there're no "Killed process 1234 (task)" found). Do you have any
> idea why?

That's /var/log/messages. I posted it and not dmesg because the whole log
didn't fit in the dmesg buffer. Here is what I have (compare timestamps):

% dmesg|grep -i kill

[ 1493.064458] Out of memory: kill process 6304 (kdeinit4) score 1190231 or a child
[ 1493.064467] Killed process 6409 (konqueror)
[ 1493.261149] knotify4 invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1493.261166] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1493.276528] Out of memory: kill process 6304 (kdeinit4) score 1161265 or a child
[ 1493.276538] Killed process 6411 (krusader)
[ 1499.221160] akregator invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1499.221178] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1499.236431] Out of memory: kill process 6304 (kdeinit4) score 1067593 or a child
[ 1499.236441] Killed process 6412 (irexec)
[ 1499.370192] firefox-bin invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1499.370209] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1499.385417] Out of memory: kill process 6304 (kdeinit4) score 1066861 or a child
[ 1499.385427] Killed process 6420 (xchm)
[ 1499.458304] kio_file invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1499.458333] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1499.458367] [<ffffffff81120900>] ? d_kill+0x5c/0x7c
[ 1499.473573] Out of memory: kill process 6304 (kdeinit4) score 1043690 or a child
[ 1499.473582] Killed process 6425 (kio_file)
[ 1500.250746] korgac invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1500.250765] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1500.266186] Out of memory: kill process 6304 (kdeinit4) score 1020350 or a child
[ 1500.266196] Killed process 6464 (icedove)
[ 1500.349355] syslog-ng invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1500.349371] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1500.364689] Out of memory: kill process 6304 (kdeinit4) score 1019864 or a child
[ 1500.364699] Killed process 6477 (kio_http)
[ 1500.452151] kded4 invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1500.452167] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1500.452196] [<ffffffff81120900>] ? d_kill+0x5c/0x7c
[ 1500.467307] Out of memory: kill process 6304 (kdeinit4) score 993142 or a child
[ 1500.467316] Killed process 6478 (kio_http)
[ 1500.780222] akregator invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1500.780239] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1500.796280] Out of memory: kill process 6304 (kdeinit4) score 966331 or a child
[ 1500.796290] Killed process 6484 (kio_http)
[ 1501.065374] syslog-ng invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1501.065390] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1501.080579] Out of memory: kill process 6304 (kdeinit4) score 939434 or a child
[ 1501.080587] Killed process 6486 (kio_http)
[ 1501.381188] knotify4 invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1501.381204] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1501.396338] Out of memory: kill process 6304 (kdeinit4) score 912691 or a child
[ 1501.396346] Killed process 6487 (firefox-bin)
[ 1502.661294] icedove-bin invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1502.661311] [<ffffffff810d6dd7>] ? oom_kill_process+0x9a/0x264
[ 1502.676563] Out of memory: kill process 7580 (test) score 708945 or a child
[ 1502.676575] Killed process 7580 (test)
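
To compare the selected task with the task that was actually killed, the "Out of memory" and "Killed process" lines in a log like the one above can be paired up. The following is a minimal sketch, assuming each log entry sits on a single line (wrapped lines would need to be joined first). The function name pair_kills and the dictionary keys are hypothetical, and the sample is a handful of lines copied from the dmesg output in this message.

```python
import re

# A few lines copied from the dmesg output above, one log entry per line.
SAMPLE = """\
[ 1493.064458] Out of memory: kill process 6304 (kdeinit4) score 1190231 or a child
[ 1493.064467] Killed process 6409 (konqueror)
[ 1501.381188] knotify4 invoked oom-killer: gfp_mask=0x201da, order=0, oomkilladj=0
[ 1501.396338] Out of memory: kill process 6304 (kdeinit4) score 912691 or a child
[ 1501.396346] Killed process 6487 (firefox-bin)
[ 1502.676563] Out of memory: kill process 7580 (test) score 708945 or a child
[ 1502.676575] Killed process 7580 (test)
"""

SELECTED = re.compile(r"Out of memory: kill process (\d+) \((.+?)\) score (\d+)")
KILLED = re.compile(r"Killed process (\d+) \((.+?)\)")


def pair_kills(log_text):
    """Pair each 'Out of memory' selection with the 'Killed process' line after it."""
    events = []
    pending = None  # most recent selection that has no kill line yet
    for line in log_text.splitlines():
        match = SELECTED.search(line)
        if match:
            pending = (int(match.group(1)), match.group(2), int(match.group(3)))
            continue
        match = KILLED.search(line)
        if match and pending is not None:
            selected_pid, selected_name, score = pending
            events.append({
                "selected_pid": selected_pid,
                "selected_name": selected_name,
                "score": score,
                "killed_pid": int(match.group(1)),
                "killed_name": match.group(2),
            })
            pending = None
    return events


if __name__ == "__main__":
    for event in pair_kills(SAMPLE):
        relation = "itself" if event["killed_pid"] == event["selected_pid"] else "a child"
        print(f'{event["selected_name"]} (score {event["score"]}) selected, '
              f'{event["killed_name"]} killed ({relation})')
```

Run over the sample, this makes the pattern in the log easier to see: kdeinit4 (pid 6304) is selected repeatedly while a different process is killed each time, and only the last entry kills the selected process, test (pid 7580), itself.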


> Can you explain why Xorg is preferred as a baseline to kill rather than
> krunner in your example?

Krunner is a small app for running other apps and doing similar things. It
shouldn't use a lot of memory. OTOH, Xorg has to hold all the pixmaps
and so on. That was the expected result: first Xorg, then firefox and
thunderbird.

