From: KOSAKI Motohiro on
> > > %check_badness.pl | sort -n | tail
> > > --
> > > 89924   3938    mixer_applet2
> > > 90210   3942    tomboy
> > > 94753   3936    clock-applet
> > > 101994  3919    pulseaudio
> > > 113525  4028    gnome-terminal
> > > 127340  1       init
> > > 128177  3871    nautilus
> > > 151003  11515   bash
> > > 256944  11653   mmap
> > > 425561  3829    gnome-session
> > > --
> > > Sigh, gnome-session has twice the value of mmap(1G).
> > > Of course, gnome-session only uses 6M bytes of anon memory.
> > > I wonder if this is because gnome-session has many children, but I need to
> > > dig more. Does anyone have an idea?
> > > (CCed kosaki)
> >
> > The following output addresses the issue.
> > The fact is that modern desktop applications link against a great many
> > libraries. That bloats the VSS size and increases the
> > OOM score.
> >
> > Ideally, we shouldn't account evictable file-backed mappings in oom_score.
> >
> Hmm.
> I wonder why we consider VM size for OOM killing.
> How about RSS size?

Because a misbehaving process that has been swapped out (e.g. a fork bomb)
should still be killed by the OOM killer.
RSS + swap entries is acceptable to me.
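
To make the difference between the two metrics concrete, here is a minimal sketch that computes "RSS + swap" next to the VM size from the text of /proc/<pid>/status. It assumes a kernel that exports a VmSwap line there; where that line is absent, the sketch counts swap as 0. The helper names and the sample numbers are hypothetical and not taken from this thread.

```python
import re

# Matches lines such as "VmRSS:\t   12000 kB" in /proc/<pid>/status.
_FIELD = re.compile(r"^(VmSize|VmRSS|VmSwap):\s+(\d+)\s+kB$", re.MULTILINE)


def memory_fields(status_text):
    """Return VmSize, VmRSS and VmSwap in kB; missing fields count as 0."""
    fields = {"VmSize": 0, "VmRSS": 0, "VmSwap": 0}
    for name, value in _FIELD.findall(status_text):
        fields[name] = int(value)
    return fields


def rss_plus_swap_kb(status_text):
    """Resident pages plus swapped-out pages, in kB."""
    fields = memory_fields(status_text)
    return fields["VmRSS"] + fields["VmSwap"]


def read_status(pid):
    """Read the raw status text of a live process (Linux only)."""
    with open(f"/proc/{pid}/status") as fh:
        return fh.read()


# Hypothetical sample: large virtual size, small resident and swapped size.
SAMPLE = (
    "Name:\texample\n"
    "VmSize:\t  425000 kB\n"
    "VmRSS:\t   12000 kB\n"
    "VmSwap:\t    3000 kB\n"
)
```

For a process like the sample, the VM size is dominated by mappings that are not resident, while RSS + swap still grows for a process whose anonymous memory has been pushed out to swap.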



--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Minchan Kim on
On Tue, 27 Oct 2009 15:46:36 +0900 (JST)
KOSAKI Motohiro <kosaki.motohiro(a)jp.fujitsu.com> wrote:

> > > > %check_badness.pl | sort -n | tail
> > > > --
> > > > 89924   3938    mixer_applet2
> > > > 90210   3942    tomboy
> > > > 94753   3936    clock-applet
> > > > 101994  3919    pulseaudio
> > > > 113525  4028    gnome-terminal
> > > > 127340  1       init
> > > > 128177  3871    nautilus
> > > > 151003  11515   bash
> > > > 256944  11653   mmap
> > > > 425561  3829    gnome-session
> > > > --
> > > > Sigh, gnome-session has twice the value of mmap(1G).
> > > > Of course, gnome-session only uses 6M bytes of anon memory.
> > > > I wonder if this is because gnome-session has many children, but I need to
> > > > dig more. Does anyone have an idea?
> > > > (CCed kosaki)
> > >
> > > The following output addresses the issue.
> > > The fact is that modern desktop applications link against a great many
> > > libraries. That bloats the VSS size and increases the
> > > OOM score.
> > >
> > > Ideally, we shouldn't account evictable file-backed mappings in oom_score.
> > >
> > Hmm.
> > I wonder why we consider VM size for OOM killing.
> > How about RSS size?
>
> Because a misbehaving process that has been swapped out (e.g. a fork bomb)
> should still be killed by the OOM killer.
> RSS + swap entries is acceptable to me.

That sounds reasonable to me.
As I mentioned in my reply to Kame, Vedran wasn't using swap in his case.
I think the problem is that only VM size is considered.

--
Kind regards,
Minchan Kim
From: Vedran Furač on
KAMEZAWA Hiroyuki wrote:

> On Mon, 26 Oct 2009 17:16:14 +0100
> Vedran Furač <vedran.furac(a)gmail.com> wrote:
>>> - Could you show me /var/log/dmesg and /var/log/messages at OOM ?
>> It was a catastrophe. :) X crashed (or was killed) along with all the
>> programs, but my little program stayed alive for 20 minutes (see
>> timestamps). For that whole time the computer was completely unusable. I
>> couldn't even get a console via ssh. Really embarrassing for a modern OS to
>> get destroyed by 5 lines of C run as an ordinary user. Luckily screen was
>> still alive; the OOM killer usually kills it too. See for yourself:
>>
>> dmesg: http://pastebin.com/f3f83738a
>> messages: http://pastebin.com/f2091110a
>>
>> (CCing lkml again... I just want people to see the logs.)
>>
> Thank you for reporting and for your patience. I agree that it seems
> strange that your KDE programs are killed.

No problem. I want this to be solved as much as you do. Actually, it is
not strange, just a buggy algorithm.

Run:

% ps -T -eo pid,ppid,tid,vsz,command

You'll see that the ppid of a number of processes is kdeinit, gnome-session,
fvwm or something else, depending on what one is using. All of these
processes are started automatically during startup, or manually by clicking
on a menu item or using a keyboard shortcut. The OOM algorithm just sums the
memory usage of all of them and adds that to the parent. Just plain wrong.

Also, it seems it's looking at VIRT instead of RES.
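
The effect described above can be illustrated with a simplified model. This is only a sketch, not the kernel's code: it assumes the score starts from the process's own virtual size and that each direct child adds half of its virtual size plus one, which is how the heuristic of that era is commonly described. The class, the function name and all numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Proc:
    name: str
    total_vm: int                      # own virtual size, in pages
    children: list = field(default_factory=list)


def badness_with_children(proc):
    """Own VM size plus a share of every direct child's VM size."""
    points = proc.total_vm
    for child in proc.children:
        # Assumption: each child contributes half of its VM size, plus one.
        points += child.total_vm // 2 + 1
    return points


# Hypothetical desktop: a tiny launcher with 30 medium-sized children,
# and one unrelated process that allocated 1 GiB (262144 pages of 4 KiB).
apps = [Proc(f"app{i}", 40000) for i in range(30)]
launcher = Proc("launcher", 1500, children=apps)
hog = Proc("hog", 262144)

scores = {p.name: badness_with_children(p) for p in (launcher, hog)}
```

In this model the launcher scores 601530 against 262144 for the hog, even though the launcher's own size is tiny. That matches the pattern in the listings in this thread, where kdeinit4 and gnome-session outrank the 1 GB test program.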

> I attached a script for checking the oom_score of all existing processes.
> (oom_score is the value used for selecting "bad" processes.)
> Please run it if you have time.

96890 21463 VirtualBox // OK
118615 11144 kded4 // WRONG
127455 11158 knotify4 // WRONG
132198 1 init // WRONG
133940 11151 ksmserver // WRONG
134109 11224 audacious2 // Audio player, maybe
145476 21503 VirtualBox // OK
174939 11322 icedove-bin // thunderbird, maybe
178015 11223 akregator // rss reader, maybe
201043 22672 krusader // WRONG
212609 11187 krunner // WRONG
256911 24252 test // culprit, malloced 1GB
1750371 11318 run-mozilla.sh // tiny, parent of firefox threads
2044902 11141 kdeinit4 // tiny, parent of most KDE apps
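
The attached script is not included in this archive. A minimal sketch of something equivalent is shown here: it reads /proc/<pid>/oom_score for every process and sorts by score, producing the same "score, pid, name" columns as the listing above. It assumes /proc/<pid>/comm is available for the process name; the function name and the proc_root parameter are hypothetical and exist only to make the sketch testable.

```python
import os


def collect_scores(proc_root="/proc"):
    """Return (oom_score, pid, name) tuples sorted by ascending score."""
    rows = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue                    # skip non-process entries like "self"
        base = os.path.join(proc_root, entry)
        try:
            with open(os.path.join(base, "oom_score")) as fh:
                score = int(fh.read().strip())
            with open(os.path.join(base, "comm")) as fh:
                name = fh.read().strip()
        except (OSError, ValueError):
            continue                    # process exited or file unreadable
        rows.append((score, int(entry), name))
    return sorted(rows)


def format_rows(rows):
    """Render rows as tab-separated 'score pid name' lines."""
    return "\n".join(f"{score}\t{pid}\t{name}" for score, pid, name in rows)
```

Calling format_rows(collect_scores()[-10:]) on a Linux machine gives the ten highest-scoring processes, comparable to piping the original script through sort -n | tail.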

> Sigh, gnome-session has twice the value of mmap(1G).
> Of course, gnome-session only uses 6M bytes of anon memory.
> I wonder if this is because gnome-session has many children, but I need to

Yes it is.

Regards,

Vedran
From: KOSAKI Motohiro on
>> I attached a script for checking the oom_score of all existing processes.
>> (oom_score is the value used for selecting "bad" processes.)
>> Please run it if you have time.
>
> 96890   21463   VirtualBox // OK
> 118615  11144   kded4 // WRONG
> 127455  11158   knotify4 // WRONG
> 132198  1       init // WRONG
> 133940  11151   ksmserver // WRONG
> 134109  11224   audacious2 // Audio player, maybe
> 145476  21503   VirtualBox // OK
> 174939  11322   icedove-bin // thunderbird, maybe
> 178015  11223   akregator // rss reader, maybe
> 201043  22672   krusader  // WRONG
> 212609  11187   krunner // WRONG
> 256911  24252   test // culprit, malloced 1GB
> 1750371 11318   run-mozilla.sh // tiny, parent of firefox threads
> 2044902 11141   kdeinit4 // tiny, parent of most KDE apps

Vedran, I have come up with an alternative improvement. Could you please
measure the badness scores on your system?
Maybe your culprit process will get the biggest badness value.

Note: this patch changes the time-related part of the calculation, so please
drink a cup of coffee before measuring.
A short wait gives a correct test result.
From: Vedran Furač on
KOSAKI Motohiro wrote:

>>> I attached a script for checking the oom_score of all existing processes.
>>> (oom_score is the value used for selecting "bad" processes.)
>>> Please run it if you have time.
>> 96890 21463 VirtualBox // OK
>> 118615 11144 kded4 // WRONG
>> 127455 11158 knotify4 // WRONG
>> 132198 1 init // WRONG
>> 133940 11151 ksmserver // WRONG
>> 134109 11224 audacious2 // Audio player, maybe
>> 145476 21503 VirtualBox // OK
>> 174939 11322 icedove-bin // thunderbird, maybe
>> 178015 11223 akregator // rss reader, maybe
>> 201043 22672 krusader // WRONG
>> 212609 11187 krunner // WRONG
>> 256911 24252 test // culprit, malloced 1GB
>> 1750371 11318 run-mozilla.sh // tiny, parent of firefox threads
>> 2044902 11141 kdeinit4 // tiny, parent of most KDE apps
>
> Vedran, I have come up with an alternative improvement. Could you please
> measure the badness scores on your system?
> Maybe your culprit process will get the biggest badness value.

Thanks, I'll test it during the week. But note that not every user
reboots their computer every day. I, for example, usually have it up for
days. And when it comes to my laptop, weeks, as I just suspend it when
I don't use it. Maybe the best way is to combine the two patches. Also, you
and others could test these patches too. It is not only my kernel that
behaves strangely. :)

> Note: this patch changes the time-related part of the calculation, so please
> drink a cup of coffee before measuring.
> A short wait gives a correct test result.

OK. :)

Regards,

Vedran
