From: Arve Hjønnevåg on
2010/4/23 Tejun Heo <tj(a)kernel.org>:
> Hello,
>
> On 04/24/2010 12:49 AM, Arve Hjønnevåg wrote:
>> I want the suspend blocker active when the work is pending or running.
>> I did not see a way to do this on top of the workqueue api without
>> adding additional locking.
>
> Well, then add additional locking. suspend_blocker is far away from
> being a hot path and there's nothing wrong with additional locking as
> long as everything is wrapped inside proper APIs. Adding stuff to the
> core code for things as specific as this is much worse.
>

OK, I'll try to do this. Do you want it in a separate header file as well?

>> If the work is both queued and starts running on another workqueue
>> between "get_wq_data(work) == cwq" and "!work_pending(work)", then
>> suspend_unblock will be called when it shouldn't. It should work fine
>> if I change it to check pending first though, since it cannot move
>> back to the current workqueue without locking cwq->lock first.
>
> The code is more fundamentally broken. Once work->func has started
> executing, the work pointer can't be dereferenced as work callback is
> allowed to free or re-purpose the work data structure and in the same
> manner you can't check for pending status after execution has
> completed.
>

For suspend blocking work, which does not allow that, I only touch the
work structure after the callback has returned.

>> Or are you talking about the race when the callback is running on
>> multiple (cpu) workqueues at the same time? In that case the suspend
>> blocker is released when the callback returns from the last workqueue
>> it was queued on, not when all the callbacks have returned. On that
>> note, is it ever safe to use flush_work and cancel_work_sync for work
>> queued on anything other than a single single-threaded workqueue?
>
> flush_work() is guaranteed only to flush from the last queued cpu but
> cancel_work_sync() will guarantee that no cpu is executing or holding
> onto the work. So, yeah, as long as the limitations of flush_work()
> are understood, it's safe.
>

Sorry, I didn't see the for_each_cpu(cpu, cpu_map) part of
wait_on_work(). cancel_work_sync() does look safe as long as the work
has not moved to a completely different workqueue.

> Going back to the original subject, just add simplistic outside
> locking in suspend_blocker_work API (or whatever other name you
> prefer). That will be much cleaner and safer. Let's think about
> moving them into workqueue implementation proper after the number of
> the API users grows to hundreds.
>

OK.

--
Arve Hjønnevåg
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Tejun Heo on
Hello,

On 04/24/2010 09:21 AM, Arve Hjønnevåg wrote:
> OK, I'll try to do this. Do you want it in a separate header file as well?

I think it would fit better with the rest of suspend_blocker.h APIs
but it's not like I thought about it hard and long, so please take my
preference with a grain of salt.

> I only touch the work structure after the callback has returned for
> suspend blocking work, which does not allow that.

Oh, you're right. I missed the current_work_blocks_suspend test.
Sorry about that.

Thanks.

--
tejun
From: Oleg Nesterov on
On 04/23, Arve Hjønnevåg wrote:
>
> On Fri, Apr 23, 2010 at 5:20 AM, Oleg Nesterov <oleg(a)redhat.com> wrote:
>
> > The unblock code in run_workqueue() is racy,
> > it can unblock after the work was queued on another CPU, cwq->lock can't
> > help.
>
> If the work is both queued and starts running on another workqueue
> between "get_wq_data(work) == cwq" and "!work_pending(work)", then
> suspend_unblock will be called when it shouldn't. It should work fine
> if I change it to check pending first though, since it cannot move
> back to the current workqueue without locking cwq->lock first.

It can be queued on the same or different workqueue on another CPU,
right before run_workqueue() does unblock.

But I guess this doesn't matter. You already discussed this all with
Tejun while I was sleeping ;) And I agree very much, the new plan looks
much better imho.

Oleg.

From: Tejun Heo on
Hello,

> +static void suspend_blocking_work_complete(struct suspend_blocking_work *work)
> +{
> +	unsigned long flags;
> +
> +	WARN_ON(!work->active);
> +	spin_lock_irqsave(&work->lock, flags);
> +	if (!--work->active)
> +		suspend_unblock(&work->suspend_blocker);
> +	spin_unlock_irqrestore(&work->lock, flags);
> +}

Maybe work->active can be an atomic_t and the lock can be removed?

> +/**
> + * suspend_blocking_work_destroy - Destroy suspend_blocking_work
> + * @work: The work item in question
> + *
> + * If the work was ever queued on more than one workqueue all but the last
> + * workqueue must be flushed before calling suspend_blocking_work_destroy.

As it's calling cancel_work_sync(), the above is not true. As long as
no one is trying to queue it again, suspend_blocking_work_destroy() is
safe to call regardless of how the work has been used.

> +void suspend_blocking_work_destroy(struct suspend_blocking_work *work)
> +{
> +	cancel_suspend_blocking_work_sync(work);
> +	WARN_ON(work->active);
> +	suspend_blocker_destroy(&work->suspend_blocker);
> +}
> +EXPORT_SYMBOL(suspend_blocking_work_destroy);
> +EXPORT_SYMBOL(suspend_blocking_work_destroy);

Other than the above, it looks good to me.

Thanks.

--
tejun
From: Arve Hjønnevåg on
2010/4/27 Tejun Heo <tj(a)kernel.org>:
> Hello,
>
>> +static void suspend_blocking_work_complete(struct suspend_blocking_work *work)
>> +{
>> +	unsigned long flags;
>> +
>> +	WARN_ON(!work->active);
>> +	spin_lock_irqsave(&work->lock, flags);
>> +	if (!--work->active)
>> +		suspend_unblock(&work->suspend_blocker);
>> +	spin_unlock_irqrestore(&work->lock, flags);
>> +}
>
> Maybe work->active can be an atomic_t and the lock can be removed?
>

I need the spinlock to prevent the work from getting re-queued before
suspend_unblock.

>> +/**
>> + * suspend_blocking_work_destroy - Destroy suspend_blocking_work
>> + * @work: The work item in question
>> + *
>> + * If the work was ever queued on more than one workqueue all but the last
>> + * workqueue must be flushed before calling suspend_blocking_work_destroy.
>
> As it's calling cancel_work_sync(), the above is not true. As long as
> no one is trying to queue it again, suspend_blocking_work_destroy() is
> safe to call regardless of how the work has been used.
>

I'm not sure what the best terminology is here, but cancel_work_sync()
only waits for work running on all the cpu-workqueues of the last
workqueue. So, if the caller queued the work on more than one
workqueue, suspend_blocking_work_destroy does not ensure that the
suspend_blocking_work structure is not still in use (it should trigger
the WARN_ON though).

>> +void suspend_blocking_work_destroy(struct suspend_blocking_work *work)
>> +{
>> +	cancel_suspend_blocking_work_sync(work);
>> +	WARN_ON(work->active);
>> +	suspend_blocker_destroy(&work->suspend_blocker);
>> +}
>> +EXPORT_SYMBOL(suspend_blocking_work_destroy);
>
> Other than the above, it looks good to me.
>
> Thanks.
>
> --
> tejun
>
--
Arve Hjønnevåg