From: Peter Zijlstra on
On Mon, 2010-05-31 at 20:56 +0200, Tejun Heo wrote:
> Currently, when a cpu goes down, cpu_active is cleared before
> CPU_DOWN_PREPARE starts and cpuset configuration is updated from a
> default priority cpu notifier. When a cpu is coming up, it's set
> before CPU_ONLINE but cpuset configuration again is updated from the
> same cpu notifier.
>
> For cpu notifiers, this presents an inconsistent state. Threads which
> a CPU_DOWN_PREPARE notifier expects to be bound to the CPU can be
> migrated to other cpus because the cpu is no more inactive.
>
> Fix it by updating cpu_active in the highest priority cpu notifier and
> cpuset configuration in the second highest when a cpu is coming up.
> Down path is updated similarly. This guarantees that all other cpu
> notifiers see consistent cpu_active and cpuset configuration.
>
> This problem is triggered by cmwq. During CPU_DOWN_PREPARE, hotplug
> callback creates a kthread and kthread_bind()s it to the target cpu,
> and the thread is expected to run on that cpu.

I know we all love notifier lists, but doesn't the below code get lots
more readable if we don't play tricks with notifier priorities and
simply hardcode the few (perf/sched/cpuset) callbacks into the hotplug
paths?

Also, I'm afraid you've now inverted the relation between
cpu_active_mask and parition_sched_domains().

You need to first set/clear the active mask, then rebuild the domain.
But with your patch parition_sched_domains() gets called in the regular
DOWN_PREPARE path, while we only clear active at the very end, which
means we build the wrong domains.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Tue, 2010-06-01 at 18:58 +0200, Tejun Heo wrote:
> Hello,
>
> On 06/01/2010 05:57 PM, Peter Zijlstra wrote:
> > I know we all love notifier lists, but doesn't the below code get lots
> > more readable if we don't play tricks with notifier priorities and
> > simply hardcode the few (perf/sched/cpuset) callbacks into the hotplug
> > paths?
>
> Maybe, maybe not. In this case, I kind of like that sleep failure
> case doesn't have to be explicitly rolled back but if you like hard
> coding that's fine too.

Hurm, yeah that rollback might make things messy indeed.

> > Also, I'm afraid you've now inverted the relation between
> > cpu_active_mask and parition_sched_domains().
> >
> > You need to first set/clear the active mask, then rebuild the domain.
> > But with your patch parition_sched_domains() gets called in the regular
> > DOWN_PREPARE path, while we only clear active at the very end, which
> > means we build the wrong domains.
>
> Ah, right forgot about that. So, the things that need to be ordered
> are cpu_active mask update, cpuset configuration and sched domains,
> right?

Right, I think that should cover it.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Tejun Heo on
On 06/02/2010 06:03 PM, Tejun Heo wrote:
> Okay, how about this one? Using notifiers seems better for the
> following reasons.
>
> * Rollback on failure.
>
> * cpuset/sched_domain don't expect to be called before smp
> configuration is complete. If hardcoded into cpu_up/down(),
> condition checks need to be added so that they're skipped if the
> system is bringing up the cpus for the first time.
>
> Works fine here w/ CPUSET enabled and disabled.

Ping. Peter.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Wed, 2010-06-02 at 18:03 +0200, Tejun Heo wrote:
> Currently, when a cpu goes down, cpu_active is cleared before
> CPU_DOWN_PREPARE starts and cpuset configuration is updated from a
> default priority cpu notifier. When a cpu is coming up, it's set
> before CPU_ONLINE but cpuset configuration again is updated from the
> same cpu notifier.
>
> For cpu notifiers, this presents an inconsistent state. Threads which
> a CPU_DOWN_PREPARE notifier expects to be bound to the CPU can be
> migrated to other cpus because the cpu is no more inactive.
>
> Fix it by updating cpu_active in the highest priority cpu notifier and
> cpuset configuration in the second highest when a cpu is coming up.
> Down path is updated similarly. This guarantees that all other cpu
> notifiers see consistent cpu_active and cpuset configuration.
>
> cpuset_track_online_cpus() notifier is converted to
> cpuset_update_active_cpus() which just updates the configuration and
> now called from cpuset_cpu_[in]active() notifiers registered from
> sched_init_smp(). If cpuset is disabled, cpuset_update_active_cpus()
> degenerates into partition_sched_domains() making separate notifier
> for !CONFIG_CPUSETS unnecessary.
>
> This problem is triggered by cmwq. During CPU_DOWN_PREPARE, hotplug
> callback creates a kthread and kthread_bind()s it to the target cpu,
> and the thread is expected to run on that cpu.
>
> Signed-off-by: Tejun Heo <tj(a)kernel.org>
> Cc: Rusty Russell <rusty(a)rustcorp.com.au>
> Cc: Peter Zijlstra <a.p.zijlstra(a)chello.nl>
> Cc: Ingo Molnar <mingo(a)elte.hu>
> Cc: Paul Menage <menage(a)google.com>
> ---
> Okay, how about this one? Using notifiers seems better for the
> following reasons.
>
> * Rollback on failure.
>
> * cpuset/sched_domain don't expect to be called before smp
> configuration is complete. If hardcoded into cpu_up/down(),
> condition checks need to be added so that they're skipped if the
> system is bringing up the cpus for the first time.
>
> Works fine here w/ CPUSET enabled and disabled.

OK, this looks good.

Thanks Tejun.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Peter Zijlstra on
On Fri, 2010-06-04 at 14:03 +0200, Tejun Heo wrote:
> On 06/04/2010 01:58 PM, Peter Zijlstra wrote:
> >> * Rollback on failure.
> >>
> >> * cpuset/sched_domain don't expect to be called before smp
> >> configuration is complete. If hardcoded into cpu_up/down(),
> >> condition checks need to be added so that they're skipped if the
> >> system is bringing up the cpus for the first time.
> >>
> >> Works fine here w/ CPUSET enabled and disabled.
> >
> > OK, this looks good.
>
> Cool, can I add your acked-by's to and send them towards Ingo?

Yes.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/