From: Oleg Nesterov on
On 05/27, Luis Claudio R. Goncalves wrote:
>
> It sounds plausible giving the dying task an even higher priority to be
> sure it will be scheduled sooner and free the desired memory.

As usual, I can't really comment on the changes in the oom logic, just minor
nits...

> @@ -413,6 +415,8 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
> */
> p->rt.time_slice = HZ;
> set_tsk_thread_flag(p, TIF_MEMDIE);
> + param.sched_priority = MAX_RT_PRIO-1;
> + sched_setscheduler(p, SCHED_FIFO, &param);
>
> force_sig(SIGKILL, p);

Probably sched_setscheduler_nocheck() makes more sense.

Minor, but perhaps it would be a bit better to send SIGKILL first,
then raise its prio.

Oleg.
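
The ordering suggested above (send the signal first, then raise the priority) can be sketched in userspace. This is only a rough analogue under stated assumptions, not the kernel patch: kill(2) and sched_setscheduler(2) stand in for force_sig() and sched_setscheduler_nocheck(), the helper names top_fifo_param() and kill_then_boost() are made up for illustration, and the userspace call is subject to the permission checks that the _nocheck variant skips inside the kernel.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: fill *param with the highest SCHED_FIFO priority
 * the running system reports. Returns 0 on success, -1 on failure.
 */
static int top_fifo_param(struct sched_param *param)
{
	int max;

	assert(param != NULL);
	max = sched_get_priority_max(SCHED_FIFO);
	if (max < 0)
		return -1;
	memset(param, 0, sizeof(*param));
	param->sched_priority = max;
	return 0;
}

/*
 * Hypothetical helper: send SIGKILL first, then try to move the target to
 * SCHED_FIFO at the top priority. Returns 0 on success or -errno from the
 * first step that fails.
 */
static int kill_then_boost(pid_t pid)
{
	struct sched_param param;

	if (top_fifo_param(&param) < 0)
		return -errno;
	if (kill(pid, SIGKILL) < 0)
		return -errno;
	if (sched_setscheduler(pid, SCHED_FIFO, &param) < 0)
		return -errno;
	return 0;
}
```

Without the privilege to set a real-time policy, the boost step in this sketch would be expected to fail with -EPERM after the signal has already been sent, and a target that has already gone away would yield -ESRCH. The in-kernel path in the patch does not have the permission problem, which is the point of using the _nocheck variant.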

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: KOSAKI Motohiro on
Hi Luis,

> On 05/27, Luis Claudio R. Goncalves wrote:
> >
> > It sounds plausible giving the dying task an even higher priority to be
> > sure it will be scheduled sooner and free the desired memory.
>
> As usual, I can't really comment the changes in oom logic, just minor
> nits...
>
> > @@ -413,6 +415,8 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
> > */
> > p->rt.time_slice = HZ;
> > set_tsk_thread_flag(p, TIF_MEMDIE);
> > + param.sched_priority = MAX_RT_PRIO-1;
> > + sched_setscheduler(p, SCHED_FIFO, &param);
> >
> > force_sig(SIGKILL, p);
>
> Probably sched_setscheduler_nocheck() makes more sense.
>
> Minor, but perhaps it would be a bit better to send SIGKILL first,
> then raise its prio.

I have no objection either, but I don't think the point Oleg raised is minor.
Please send an updated patch.

Thanks.


From: Luis Claudio R. Goncalves on
On Fri, May 28, 2010 at 11:54:07AM +0900, KOSAKI Motohiro wrote:
| Hi Luis,
|
| > On 05/27, Luis Claudio R. Goncalves wrote:
| > >
| > > It sounds plausible giving the dying task an even higher priority to be
| > > sure it will be scheduled sooner and free the desired memory.
| >
| > As usual, I can't really comment the changes in oom logic, just minor
| > nits...
| >
| > > @@ -413,6 +415,8 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
| > > */
| > > p->rt.time_slice = HZ;
| > > set_tsk_thread_flag(p, TIF_MEMDIE);
| > > + param.sched_priority = MAX_RT_PRIO-1;
| > > + sched_setscheduler(p, SCHED_FIFO, &param);
| > >
| > > force_sig(SIGKILL, p);
| >
| > Probably sched_setscheduler_nocheck() makes more sense.
| >
| > Minor, but perhaps it would be a bit better to send SIGKILL first,
| > then raise its prio.
|
| I have no objection too. but I don't think Oleg's pointed thing is minor.
| Please send updated patch.
|
| Thanks.

This version of the patch addresses the suggestions from Oleg Nesterov and
Kosaki Motohiro.

Thanks again for reviewing the patch.

oom-kill: give the dying task a higher priority (v2)

In a system under heavy load it was observed that even after the
oom-killer selects a task to die, the task may take a long time to die.

Right before sending a SIGKILL to the task selected by the oom-killer,
this task has its priority increased so that it can exit() soon,
freeing memory. That is accomplished by:

	/*
	 * We give our sacrificial lamb high priority and access to
	 * all the memory it needs. That way it should be able to
	 * exit() and clear out its resources quickly...
	 */
	p->rt.time_slice = HZ;
	set_tsk_thread_flag(p, TIF_MEMDIE);

It seems plausible to give the dying task an even higher priority, so that
it is scheduled sooner and frees the desired memory. Oleg Nesterov pointed
out that it would be better to send the signal before raising the task
priority.

Signed-off-by: Luis Claudio R. Gonçalves <lclaudio(a)uudg.org>

diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index b68e802..d352b3e 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -382,6 +382,8 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
  */
 static void __oom_kill_task(struct task_struct *p, int verbose)
 {
+	struct sched_param param;
+
 	if (is_global_init(p)) {
 		WARN_ON(1);
 		printk(KERN_WARNING "tried to kill init!\n");
@@ -413,8 +415,9 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
 	 */
 	p->rt.time_slice = HZ;
 	set_tsk_thread_flag(p, TIF_MEMDIE);
-
 	force_sig(SIGKILL, p);
+	param.sched_priority = MAX_RT_PRIO-1;
+	sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
 }
 
 static int oom_kill_task(struct task_struct *p)
--
[ Luis Claudio R. Goncalves Bass - Gospel - RT ]
[ Fingerprint: 4FDD B8C4 3C59 34BD 8BE9 2696 7203 D980 A448 C8F8 ]

From: Balbir Singh on
* Luis Claudio R. Goncalves <lclaudio(a)uudg.org> [2010-05-28 00:51:47]:

> @@ -382,6 +382,8 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
> */
> static void __oom_kill_task(struct task_struct *p, int verbose)
> {
> + struct sched_param param;
> +
> if (is_global_init(p)) {
> WARN_ON(1);
> printk(KERN_WARNING "tried to kill init!\n");
> @@ -413,8 +415,9 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
> */
> p->rt.time_slice = HZ;
> set_tsk_thread_flag(p, TIF_MEMDIE);
> -
> force_sig(SIGKILL, p);
> + param.sched_priority = MAX_RT_PRIO-1;
> + sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
> }
>

I would like to understand the visible benefits of this patch. Have
you seen an OOM-killed task really get bogged down? Should this task
really be competing with other important tasks for run time?

--
Three Cheers,
Balbir
From: KOSAKI Motohiro on
> * Luis Claudio R. Goncalves <lclaudio(a)uudg.org> [2010-05-28 00:51:47]:
>
> > @@ -382,6 +382,8 @@ static void dump_header(struct task_struct *p, gfp_t gfp_mask, int order,
> > */
> > static void __oom_kill_task(struct task_struct *p, int verbose)
> > {
> > + struct sched_param param;
> > +
> > if (is_global_init(p)) {
> > WARN_ON(1);
> > printk(KERN_WARNING "tried to kill init!\n");
> > @@ -413,8 +415,9 @@ static void __oom_kill_task(struct task_struct *p, int verbose)
> > */
> > p->rt.time_slice = HZ;
> > set_tsk_thread_flag(p, TIF_MEMDIE);
> > -
> > force_sig(SIGKILL, p);
> > + param.sched_priority = MAX_RT_PRIO-1;
> > + sched_setscheduler_nocheck(p, SCHED_FIFO, &param);
> > }
> >
>
> I would like to understand the visible benefits of this patch. Have
> you seen an OOM kill tasked really get bogged down. Should this task
> really be competing with other important tasks for run time?

What do you mean by important? Until the OOM victim task exits completely, the
system has no memory, and none of the important tasks can do anything.

In almost all kernel subsystems, an automatic priority boost is a really bad idea
because it may break the deterministic behavior of RT tasks. But OOM is one of the
exceptions: determinism was already broken by memory starvation.

That's the reason I acked it.
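
To make the effect of the boost concrete, the sketch below models how an RT priority maps to the scheduler's internal priority. It assumes MAX_RT_PRIO is 100 and that RT tasks get an internal prio of MAX_RT_PRIO-1 - rt_priority (a lower value runs first), as I understand mainline kernels of this period to behave. The constant MODEL_MAX_RT_PRIO and the functions model_rt_prio() and model_runs_before() are defined locally for illustration and are not kernel code.

```c
#include <assert.h>

/* Assumed to mirror the kernel's MAX_RT_PRIO; redefined for illustration. */
#define MODEL_MAX_RT_PRIO 100

/*
 * Model of the RT priority mapping: a larger rt_priority yields a smaller
 * internal prio, and a smaller internal prio is scheduled first.
 */
static int model_rt_prio(int rt_priority)
{
	assert(rt_priority >= 1 && rt_priority <= MODEL_MAX_RT_PRIO - 1);
	return MODEL_MAX_RT_PRIO - 1 - rt_priority;
}

/* Returns nonzero if a task with rt_priority a would be picked before b. */
static int model_runs_before(int a, int b)
{
	return model_rt_prio(a) < model_rt_prio(b);
}
```

Under these assumptions, the MAX_RT_PRIO-1 value used in the patch maps to internal priority 0, so the OOM victim would be picked ahead of any real-time task with a lower rt_priority and ahead of all non-RT tasks. That is the competition for run time questioned above, and the reason the boost is argued to be acceptable only in the OOM case.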

