From: Srivatsa Vaddagiri
On Mon, May 24, 2010 at 03:28:45PM +0200, Peter Zijlstra wrote:
> On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
> > since _cpu_up() and _cpu_down() can never run in
> > parallel, because of cpu_add_remove_lock.
>
> Ah indeed. I guess your initial patch works then.

One thing I found surprising was that a cpu's rt-bandwidth renewal could be
dependent on another cpu's (rt-bandwidth) timer firing on time. In this case, we
had migration/23 pulled over to CPU0 and we hung later waiting for migration/23
to exit. migration/23 was not exiting because it could not run on CPU0 (as
CPU0's rt-bandwidth had expired). This situation persisted forever. I would have
expected CPU0's bandwidth to be renewed independently of some timer on
CPU23 firing - maybe I am missing something non-obvious in the code?

- vatsa
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra
On Mon, 2010-05-24 at 20:46 +0530, Srivatsa Vaddagiri wrote:
> On Mon, May 24, 2010 at 03:28:45PM +0200, Peter Zijlstra wrote:
> > On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
> > > since _cpu_up() and _cpu_down() can never run in
> > > parallel, because of cpu_add_remove_lock.
> >
> > Ah indeed. I guess your initial patch works then.
>
> One thing I found surprising was that a cpu's rt-bandwidth renewal could be
> dependent on another cpu's (rt-bandwidth) timer firing on time. In this case, we
> had migration/23 pulled over to CPU0 and we hung later waiting for migration/23
> to exit. migration/23 was not exiting because it could not run on CPU0 (as
> CPU0's rt-bandwidth had expired). This situation persisted forever. I would have
> expected CPU0's bandwidth to be renewed independently of some timer on
> CPU23 firing - maybe I am missing something non-obvious in the code?

The bandwidth constraint is per cgroup, and cgroups span cpus.
From: Peter Zijlstra
On Mon, 2010-05-24 at 15:29 +0530, Amit K. Arora wrote:
>
> Thus, since above race can never happen, is there any other issue with
> this patch ?

It doesn't seem to apply nicely...

>
> Signed-off-by: Amit Arora <aarora(a)in.ibm.com>
> Signed-off-by: Gautham R Shenoy <ego(a)in.ibm.com>
> ---
> diff -Nuarp linux-2.6.34.org/kernel/sched.c linux-2.6.34/kernel/sched.c
> --- linux-2.6.34.org/kernel/sched.c 2010-05-18 22:56:21.000000000 -0700
> +++ linux-2.6.34/kernel/sched.c 2010-05-18 22:58:31.000000000 -0700
> @@ -5942,14 +5942,26 @@ migration_call(struct notifier_block *nf
> cpu_rq(cpu)->migration_thread = NULL;
> break;
>
> + case CPU_POST_DEAD:
> + /*
> + * Bring the migration thread down in CPU_POST_DEAD event,
> + * since the timers should have got migrated by now and thus
> + * we should not see a deadlock between trying to kill the
> + * migration thread and the sched_rt_period_timer.
> + */
> + cpuset_lock();
> + rq = cpu_rq(cpu);
> + kthread_stop(rq->migration_thread);
> + put_task_struct(rq->migration_thread);
> + rq->migration_thread = NULL;
> + cpuset_unlock();
> + break;
> +
> case CPU_DEAD:
> case CPU_DEAD_FROZEN:
> cpuset_lock(); /* around calls to cpuset_cpus_allowed_lock() */
> migrate_live_tasks(cpu);
> rq = cpu_rq(cpu);
> - kthread_stop(rq->migration_thread);
> - put_task_struct(rq->migration_thread);
> - rq->migration_thread = NULL;
> /* Idle task back to normal (off runqueue, low prio) */
> raw_spin_lock_irq(&rq->lock);
> update_rq_clock(rq);
>
>
From: Peter Zijlstra
On Tue, 2010-05-25 at 18:53 +0530, Amit K. Arora wrote:

nice :-)

Thanks!

> Signed-off-by: Amit Arora <aarora(a)in.ibm.com>
> Signed-off-by: Gautham R Shenoy <ego(a)in.ibm.com>
> Cc: Ingo Molnar <mingo(a)elte.hu>
> Cc: Peter Zijlstra <peterz(a)infradead.org>
> Cc: Tejun Heo <tj(a)kernel.org>
> ---
> diff -Nuarp linux-2.6-next-20100525.org/kernel/stop_machine.c linux-2.6-next-20100525/kernel/stop_machine.c
> --- linux-2.6-next-20100525.org/kernel/stop_machine.c 2010-05-25 14:27:51.000000000 -0400
> +++ linux-2.6-next-20100525/kernel/stop_machine.c 2010-05-25 14:30:09.000000000 -0400
> @@ -321,7 +321,7 @@ static int __cpuinit cpu_stop_cpu_callba
>
> #ifdef CONFIG_HOTPLUG_CPU
> case CPU_UP_CANCELED:
> - case CPU_DEAD:
> + case CPU_POST_DEAD:
> {
> struct cpu_stop_work *work;
From: Thomas Gleixner
On Thu, 20 May 2010, Peter Zijlstra wrote:

> On Wed, 2010-05-19 at 17:43 +0530, Amit K. Arora wrote:
> > Alternate Solution considered : Another option considered was to
> > increase the priority of the hrtimer cpu offline notifier, such that it
> > gets to run before scheduler's migration cpu offline notifier. In this
> > way we are sure that the timers will get migrated before migration_call
> > tries to kill migration_thread. But, this can have some non-obvious
> > implications, suggested Srivatsa.
>
>
> > On Wed, May 19, 2010 at 11:31:55AM +0200, Peter Zijlstra wrote:
> > > The other problem is more urgent though, CPU_POST_DEAD runs outside of
> > > the hotplug lock and thus the above becomes a race where we could
> > > possible kill off the migration thread of a newly brought up cpu:
> > >
> > > cpu0 - down 2
> > > cpu1 - up 2 (allocs a new migration thread, and leaks the old one)
> > > cpu0 - post_down 2 - frees the migration thread -- oops!
> >
> > Ok. So, how about adding a check in CPU_UP_PREPARE event handling too ?
> > The cpuset_lock will synchronize, and thus avoid race between killing of
> > migration_thread in up_prepare and post_dead events.
> >
> > Here is the updated patch. If you don't like this one too, do you mind
> > suggesting an alternate approach to tackle the problem ? Thanks !
>
> Right, so this isn't pretty at all..
>
> Ingo, the comment near the migration_notifier says that migration_call
> should happen before all else, but can you see anything that would break
> if we let the timer migration happen first?
>
> Thomas?

That should work, though what is killing the scheduler's per-rq hrtimers
_before_ we migrate stuff? We don't want to migrate them, right?

Thanks,

tglx