From: Américo Wang on
On Sun, Feb 21, 2010 at 7:22 PM, Johannes Berg
<johannes(a)sipsolutions.net> wrote:
> On Sun, 2010-02-21 at 12:14 +0100, Johannes Berg wrote:
>
>>         printk("got cpu\n");
>>         for_each_online_cpu(i) {
>>                 sm_work = per_cpu_ptr(stop_machine_work, i);
>>                 INIT_WORK(sm_work, stop_cpu);
>>                 queue_work_on(i, stop_machine_wq, sm_work);
>>         }
>>         /* This will release the thread on our CPU. */
>>         put_cpu();
>>         printk("put cpu\n");
>
> As odd as that may be, it hangs in put_cpu() here.
>

Hmm, does adding synchronize_sched() in _cpu_down() help?

Something like this:

diff --git a/kernel/cpu.c b/kernel/cpu.c
index 677f253..681f5c5 100644
--- a/kernel/cpu.c
+++ b/kernel/cpu.c
@@ -228,6 +228,7 @@ static int __ref _cpu_down(unsigned int cpu, int tasks_frozen)
 	cpumask_copy(old_allowed, &current->cpus_allowed);
 	set_cpus_allowed_ptr(current, cpu_active_mask);

+	synchronize_sched();
 	err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu));
 	if (err) {
 		set_cpu_active(cpu, true);
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Johannes Berg on
On Mon, 2010-02-22 at 16:19 +0800, Américo Wang wrote:

> >> Could you test the patch below? Thanks!
> >
> > No change, sorry, still hangs right after "Disabling non-boot CPUs ..."
> > before the machine turns off.
> >
>
> Oh, I see, then this must be a different problem.
>
> My previous patch was meant to fix the cpufreq lockdep warning mentioned
> in Benjamin's report, so this hang is presumably caused by some other
> problem, not the cpufreq lockdep one.

Right, sounds like it -- and I haven't seen that lockdep report during
shutdown any more.

johannes
From: Johannes Berg on
On Mon, 2010-02-22 at 16:34 +0800, Américo Wang wrote:
> On Sun, Feb 21, 2010 at 7:22 PM, Johannes Berg
> <johannes(a)sipsolutions.net> wrote:
> > On Sun, 2010-02-21 at 12:14 +0100, Johannes Berg wrote:
> >
> >>         printk("got cpu\n");
> >>         for_each_online_cpu(i) {
> >>                 sm_work = per_cpu_ptr(stop_machine_work, i);
> >>                 INIT_WORK(sm_work, stop_cpu);
> >>                 queue_work_on(i, stop_machine_wq, sm_work);
> >>         }
> >>         /* This will release the thread on our CPU. */
> >>         put_cpu();
> >>         printk("put cpu\n");
> >
> > As odd as that may be, it hangs in put_cpu() here.
> >
>
> Hmm, does adding synchronize_sched() in _cpu_down() help?

No luck.

johannes

From: Johannes Berg on
On Mon, 2010-02-22 at 17:12 +0800, Américo Wang wrote:

> Since it hangs in put_cpu(), which is just preempt_enable(), I have begun
> to suspect that we need a synchronize_sched(), or perhaps some barrier.
> I am not sure at all.

Right.

> Before other experts look at this, I think doing a bisect would be
> very useful.

I was afraid you'd say that, it'll take forever though since I need to
walk over to it after every shutdown, see if it turned off and then turn
it on again (and possibly off).... I guess I'll get started on that.

johannes

From: Américo Wang on
On Mon, Feb 22, 2010 at 5:04 PM, Johannes Berg
<johannes(a)sipsolutions.net> wrote:
> On Mon, 2010-02-22 at 16:34 +0800, Américo Wang wrote:
>> On Sun, Feb 21, 2010 at 7:22 PM, Johannes Berg
>> <johannes(a)sipsolutions.net> wrote:
>> > On Sun, 2010-02-21 at 12:14 +0100, Johannes Berg wrote:
>> >
>> >>         printk("got cpu\n");
>> >>         for_each_online_cpu(i) {
>> >>                 sm_work = per_cpu_ptr(stop_machine_work, i);
>> >>                 INIT_WORK(sm_work, stop_cpu);
>> >>                 queue_work_on(i, stop_machine_wq, sm_work);
>> >>         }
>> >>         /* This will release the thread on our CPU. */
>> >>         put_cpu();
>> >>         printk("put cpu\n");
>> >
>> > As odd as that may be, it hangs in put_cpu() here.
>> >
>>
>> Hmm, does adding synchronize_sched() in _cpu_down() help?
>
> No luck.
>

Ok, thanks.

Since it hangs in put_cpu(), which is just preempt_enable(), I have begun
to suspect that we need a synchronize_sched(), or perhaps some barrier.
I am not sure at all.
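
To illustrate why the quoted comment says put_cpu() "will release the thread
on our CPU", here is a minimal userspace sketch of the counting behaviour
behind get_cpu()/put_cpu(). It is only a toy model, not kernel code: the
struct and every toy_* name are hypothetical, and the real scheduler does far
more than bump a counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy userspace model; every name here is hypothetical, not a kernel API. */
struct toy_cpu {
	int preempt_count;   /* > 0 means preemption is disabled */
	bool need_resched;   /* a wakeup has asked for a reschedule */
	int schedule_calls;  /* how often the "scheduler" was entered */
};

/* Models get_cpu(): disable preemption by raising the count. */
static void toy_get_cpu(struct toy_cpu *c)
{
	c->preempt_count++;
}

/* Models queueing work for this CPU: the worker wakeup is only noted. */
static void toy_queue_work(struct toy_cpu *c)
{
	c->need_resched = true;
}

/* Models put_cpu(): the pending reschedule runs once the count hits zero. */
static void toy_put_cpu(struct toy_cpu *c)
{
	assert(c->preempt_count > 0);
	if (--c->preempt_count == 0 && c->need_resched) {
		c->need_resched = false;
		c->schedule_calls++;  /* stands in for switching to the worker */
	}
}
```

In this model, work queued between toy_get_cpu() and toy_put_cpu() does not
run until toy_put_cpu() drops the count to zero, so a worker that never
yields back would show up as a hang "in put_cpu()".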

Before other experts look at this, I think doing a bisect would be very
useful.
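
As a rough sketch of what a bisect costs in test boots, the following toy
model halves a range of commits between a known-good and a known-bad index.
The commit indices, the toy_* names, and the callback are all assumptions made
for illustration; the real tool for this is git bisect, and this code says
nothing about the actual commit range involved here.

```c
#include <assert.h>

/* Hypothetical test callback: nonzero if commit `idx` shows the hang. */
typedef int (*toy_is_bad_fn)(int idx, void *ctx);

/* Example predicate: every commit at or after *ctx is bad. */
static int toy_first_bad_at(int idx, void *ctx)
{
	return idx >= *(int *)ctx;
}

/*
 * Returns the index of the first bad commit in (good, bad], assuming
 * `good` is known good and `bad` is known bad. *tests counts how many
 * commits had to be booted and checked.
 */
static int toy_bisect(int good, int bad, toy_is_bad_fn is_bad,
		      void *ctx, int *tests)
{
	assert(good < bad);
	*tests = 0;
	while (bad - good > 1) {
		int mid = good + (bad - good) / 2;

		(*tests)++;
		if (is_bad(mid, ctx))
			bad = mid;   /* first bad commit is at or before mid */
		else
			good = mid;  /* first bad commit is after mid */
	}
	return bad;
}
```

For a range of a thousand commits this model needs at most ten tests, which
is why a bisect stays feasible even when each test means a manual shutdown
and power-on.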

Again, thanks for your testing!