From: Oleg Nesterov on
On 04/23, Tejun Heo wrote:
>
> On 04/23/2010 03:08 AM, Arve Hj�nnev�g wrote:
> > Allow work to be queued that will block suspend while it is pending
> > or executing. To get the same functionality in the calling code often
> > requires a separate suspend_blocker for pending and executing work, or
> > additional state and locking.
>
> Hmm... I think this can be implemented as pure wrapper around
> workqueue instead of injecting a flag and code into workqueue core.
> Adding @fn field to suspend_blocking_work struct and using a custom
> work function to call it and then invoke suspend_unblock() should be
> enough, right? Oh, dedicated queue functions will be needed too. I
> don't think it's wise to meddle with workqueue core code for this.

Completely agreed. The patch adds very "strange" hacks into workqueue
code to solve the very specific problems.


Besides, the patch doesn't look right. suspend_unblock() can be called
twice if you use cancel_work(). Perhaps this is not a problem, I dunno.
WORK_STRUCT_SUSPEND_BLOCKING needs to ensure that cpu_workqueue_struct
has a proper alignment. The unblock code in run_workqueue() is racy,
it can unblock after the work was queued on another CPU, cwq->lock can't
help.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Tejun Heo on
Hello,

On 04/23/2010 03:08 AM, Arve Hjønnevåg wrote:
> Allow work to be queued that will block suspend while it is pending
> or executing. To get the same functionality in the calling code often
> requires a separate suspend_blocker for pending and executing work, or
> additional state and locking.

Hmm... I think this can be implemented as pure wrapper around
workqueue instead of injecting a flag and code into workqueue core.
Adding @fn field to suspend_blocking_work struct and using a custom
work function to call it and then invoke suspend_unblock() should be
enough, right? Oh, dedicated queue functions will be needed too. I
don't think it's wise to meddle with workqueue core code for this.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
On Fri, Apr 23, 2010 at 5:20 AM, Oleg Nesterov <oleg(a)redhat.com> wrote:
> On 04/23, Tejun Heo wrote:
>>
>> On 04/23/2010 03:08 AM, Arve Hj�nnev�g wrote:
>> > Allow work to be queued that will block suspend while it is pending
>> > or executing. To get the same functionality in the calling code often
>> > requires a separate suspend_blocker for pending and executing work, or
>> > additional state and locking.
>>
>> Hmm... I think this can be implemented as pure wrapper around
>> workqueue instead of injecting a flag and code into workqueue core.
>> Adding @fn field to suspend_blocking_work struct and using a custom
>> work function to call it and then invoke suspend_unblock() should be
>> enough, right? �Oh, dedicated queue functions will be needed too. �I
>> don't think it's wise to meddle with workqueue core code for this.
>
> Completely agreed. The patch adds very "strange" hacks into workqueue
> code to solve the very specific problems.
>

I want the suspend blocker active when the work is pending or running.
I did not see a way to do this on top of the workqueue api without
adding additional locking.

>
> Besides, the patch doesn't look right. suspend_unblock() can be called
> twice if you use cancel_work(). Perhaps this is not a problem, I dunno.

Calling suspend_unblock() twice is not a problem as long as
"unblocked" is the expected final state.

> WORK_STRUCT_SUSPEND_BLOCKING needs to ensure that cpu_workqueue_struct
> has a proper alignment.

OK.

> The unblock code in run_workqueue() is racy,
> it can unblock after the work was queued on another CPU, cwq->lock can't
> help.

If the work is both queued and starts running on another workqueue
between "get_wq_data(work) == cwq" and "!work_pending(work)", then
suspend_unblock will be called when it shouldn't. It should work fine
if I change to it check pending first though, since it cannot move
back to the current workqueue without locking cwq->lock first.

Or are you talking about the race when the callback is running on
multiple (cpu) workqueues at the same time. In that case the suspend
blocker is released when the callback returns from the last workqueue
is was queued on, not when all the callbacks have returned. On that
note, is it ever safe to use flush_work and cancel_work_sync for work
queues on anything other than a single single threaded workqueue?

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Arve Hjønnevåg on
2010/4/23 Arve Hj�nnev�g <arve(a)android.com>:
> On Fri, Apr 23, 2010 at 5:20 AM, Oleg Nesterov <oleg(a)redhat.com> wrote:
>> On 04/23, Tejun Heo wrote:
>>>
>>> On 04/23/2010 03:08 AM, Arve Hj�nnev�g wrote:
>>> > Allow work to be queued that will block suspend while it is pending
>>> > or executing. To get the same functionality in the calling code often
>>> > requires a separate suspend_blocker for pending and executing work, or
>>> > additional state and locking.
>>>
>>> Hmm... I think this can be implemented as pure wrapper around
>>> workqueue instead of injecting a flag and code into workqueue core.
>>> Adding @fn field to suspend_blocking_work struct and using a custom
>>> work function to call it and then invoke suspend_unblock() should be
>>> enough, right? �Oh, dedicated queue functions will be needed too. �I
>>> don't think it's wise to meddle with workqueue core code for this.
>>
>> Completely agreed. The patch adds very "strange" hacks into workqueue
>> code to solve the very specific problems.
>>
>
> I want the suspend blocker active when the work is pending or running.
> I did not see a way to do this on top of the workqueue api without
> adding additional locking.
>

I can remove the WORK_STRUCT_SUSPEND_BLOCKING flag and instead add an
argument to __queue_work and __cancel_work_timer (or use separate
copies) if you think that would be better. It would avoid increasing
the alignment requirement of cpu_workqueue_struct, but it will
probably add more code than the current patch.

--
Arve Hj�nnev�g
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Tejun Heo on
Hello,

On 04/24/2010 12:49 AM, Arve Hj�nnev�g wrote:
> I want the suspend blocker active when the work is pending or running.
> I did not see a way to do this on top of the workqueue api without
> adding additional locking.

Well, then add additional locking. suspend_blocker is far away from
being a hot path and there's nothing wrong with additional locking as
long as everything is wrapped inside proper APIs. Adding stuff to the
core code for things as specific as this is much worse.

> If the work is both queued and starts running on another workqueue
> between "get_wq_data(work) == cwq" and "!work_pending(work)", then
> suspend_unblock will be called when it shouldn't. It should work fine
> if I change to it check pending first though, since it cannot move
> back to the current workqueue without locking cwq->lock first.

The code is more fundamentally broken. Once work->func has started
executing, the work pointer can't be dereferenced as work callback is
allowed to free or re-purpose the work data structure and in the same
manner you can't check for pending status after execution has
completed.

> Or are you talking about the race when the callback is running on
> multiple (cpu) workqueues at the same time. In that case the suspend
> blocker is released when the callback returns from the last workqueue
> is was queued on, not when all the callbacks have returned. On that
> note, is it ever safe to use flush_work and cancel_work_sync for work
> queues on anything other than a single single threaded workqueue?

flush_work() is guaranteed only to flush from the last queued cpu but
cancel_work_sync() will guarantee that no cpu is executing or holding
onto the work. So, yeah, as long as the limitations of flush_work()
is understood, it's safe.

Going back to the original subject, just add simplistic outside
locking in suspend_blocker_work API (or whatever other name you
prefer). That will be much cleaner and safer. Let's think about
moving them into workqueue implementation proper after the number of
the API users grows to hundreds.

Thanks.

--
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/