vhost: replace vhost_workqueue with per-vhost kthread [Kernel]

Prev: firewire: new driver: nosy - IEEE 1394 traffic sniffer
Next: Small perf optimization

From: Michael S. Tsirkin on 26 Jul 2010 12:40

On Mon, Jul 26, 2010 at 06:14:30PM +0200, Tejun Heo wrote:
> Just one more thing.
>
> On 07/26/2010 06:05 PM, Tejun Heo wrote:
> > * Placing try_to_freeze() could be a bit annoying. It shouldn't be
> > executed when there's a work to flush.

BTW why is this important?
We could always get another work and flush right after
try_to_freeze, and then flush would block for a long time.

BTW the vhost patch you sent does not do this at all.
I am guessing it is because our thread is not freezable?

> * Similar issue exists for kthread_stop(). The kthread shouldn't exit
> while there's a work to flush (please note that kthread_worker
> interface allows detaching / attaching worker kthread during
> operation, so it should remain in consistent state with regard to
> flushing).
>
> Thanks.

Not sure I agree here. Users must synchronise flush and stop calls.
Otherwise a work might get queued after stop is called, and
you won't be able to flush it.

> --
> tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Michael S. Tsirkin on 26 Jul 2010 13:00

On Mon, Jul 26, 2010 at 06:14:30PM +0200, Tejun Heo wrote:
> Just one more thing.

I noticed that with vhost, flush_work was getting the worker
pointer as well. Can we live with this API change?

--
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Michael S. Tsirkin on 26 Jul 2010 13:10

Here's an untested patch forward-ported from vhost
(works fine for vhost).

kthread_worker: replace barriers+atomics with a lock

We can save some cycles and make code simpler by
reusing worker lock for flush, instead of atomics.
flush_kthread_work needs to get worker pointer for
this to work.

Signed-off-by: Michael S. Tsirkin <mst(a)redhat.com>

---

diff --git a/include/linux/kthread.h b/include/linux/kthread.h
index 685ea65..19ae9f2 100644
--- a/include/linux/kthread.h
+++ b/include/linux/kthread.h
@@ -58,7 +58,7 @@ struct kthread_work {
struct list_head node;
kthread_work_func_t func;
wait_queue_head_t done;
- atomic_t flushing;
+ int flushing;
int queue_seq;
int done_seq;
};
@@ -72,7 +72,7 @@ struct kthread_work {
.node = LIST_HEAD_INIT((work).node), \
.func = (fn), \
.done = __WAIT_QUEUE_HEAD_INITIALIZER((work).done), \
- .flushing = ATOMIC_INIT(0), \
+ .flushing = 0, \
}

#define DEFINE_KTHREAD_WORKER(worker) \
@@ -96,7 +96,8 @@ int kthread_worker_fn(void *worker_ptr);

bool queue_kthread_work(struct kthread_worker *worker,
struct kthread_work *work);
-void flush_kthread_work(struct kthread_work *work);
+void flush_kthread_work(struct kthread_worker *worker,
+ struct kthread_work *work);
void flush_kthread_worker(struct kthread_worker *worker);

#endif /* _LINUX_KTHREAD_H */
diff --git a/kernel/kthread.c b/kernel/kthread.c
index 2dc3786..461f58d 100644
--- a/kernel/kthread.c
+++ b/kernel/kthread.c
@@ -283,10 +283,12 @@ int kthreadd(void *unused)
int kthread_worker_fn(void *worker_ptr)
{
struct kthread_worker *worker = worker_ptr;
- struct kthread_work *work;
+ struct kthread_work *work = NULL;

+ spin_lock_irq(&worker->lock);
WARN_ON(worker->task);
worker->task = current;
+ spin_unlock_irq(&worker->lock);
repeat:
set_current_state(TASK_INTERRUPTIBLE); /* mb paired w/ kthread_stop */

@@ -298,23 +300,23 @@ repeat:
return 0;
}

- work = NULL;
spin_lock_irq(&worker->lock);
+ if (work) {
+ work->done_seq = work->queue_seq;
+ if (work->flushing)
+ wake_up_all(&work->done);
+ }
if (!list_empty(&worker->work_list)) {
work = list_first_entry(&worker->work_list,
struct kthread_work, node);
list_del_init(&work->node);
- }
+ } else
+ work = NULL;
spin_unlock_irq(&worker->lock);

if (work) {
__set_current_state(TASK_RUNNING);
work->func(work);
- smp_wmb(); /* wmb worker-b0 paired with flush-b1 */
- work->done_seq = work->queue_seq;
- smp_mb(); /* mb worker-b1 paired with flush-b0 */
- if (atomic_read(&work->flushing))
- wake_up_all(&work->done);
} else if (!freezing(current))
schedule();

@@ -353,31 +355,33 @@ EXPORT_SYMBOL_GPL(queue_kthread_work);

/**
* flush_kthread_work - flush a kthread_work
+ * @worker: where work might be running
* @work: work to flush
*
* If @work is queued or executing, wait for it to finish execution.
*/
-void flush_kthread_work(struct kthread_work *work)
+void flush_kthread_work(struct kthread_worker *worker,
+ struct kthread_work *work)
{
- int seq = work->queue_seq;
+ int seq

- atomic_inc(&work->flushing);
-
- /*
- * mb flush-b0 paired with worker-b1, to make sure either
- * worker sees the above increment or we see done_seq update.
- */
- smp_mb__after_atomic_inc();
+ spin_lock_irq(&worker->lock);
+ seq = work->queue_seq;
+ ++work->flushing;
+ spin_unlock_irq(&worker->lock);

/* A - B <= 0 tests whether B is in front of A regardless of overflow */
- wait_event(work->done, seq - work->done_seq <= 0);
- atomic_dec(&work->flushing);
-
- /*
- * rmb flush-b1 paired with worker-b0, to make sure our caller
- * sees every change made by work->func().
- */
- smp_mb__after_atomic_dec();
+ wait_event(work->done,
+ ({
+ int done;
+ spin_lock_irq(&worker->lock);
+ delta = seq - work->done_seq <= 0;
+ spin_unlock_irq(&worker->lock);
+ done;
+ });
+ spin_lock_irq(&worker->lock);
+ --work->flushing;
+ spin_unlock_irq(&worker->lock);
}
EXPORT_SYMBOL_GPL(flush_kthread_work);

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Michael S. Tsirkin on 26 Jul 2010 13:20

On Mon, Jul 26, 2010 at 06:05:27PM +0200, Tejun Heo wrote:
> Hello,
>
> On 07/26/2010 05:50 PM, Michael S. Tsirkin wrote:
> >> Hmmm... I'm not quite sure whether it's an optimization. I thought
> >> the patch was due to feeling uncomfortable about using barriers?
> >
> > Oh yes. But getting rid of barriers is what motivated me originally.
>
> Yeah, getting rid of barriers is always good. :-)
>
> > Is there a git tree with kthread_worker applied?
> > I might do this just for fun ...
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git for-next
>
> For the original implementaiton, please take a look at commit
> b56c0d8937e665a27d90517ee7a746d0aa05af46.
>
> * Can you please keep the outer goto repeat loop? I just don't like
> outermost for (;;).

Okay ... can we put the code in a {} scope to make it clear
where does the loop starts and ends?

> * Placing try_to_freeze() could be a bit annoying. It shouldn't be
> executed when there's a work to flush.

It currently seems to be executed when there is work to flush.
Is this wrong?

> * I think A - B <= 0 test would be more familiar. At least
> time_before/after() are implemented that way.

I am concerned that this overflows a signed integer -
which I seem to remeber that C99 disallows.
timer macros are on data path so might be worth the risk there,
but flush is slow path so better be safe?

> Thanks.
>
> --
> tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: Michael S. Tsirkin on 26 Jul 2010 16:10

On Mon, Jul 26, 2010 at 08:51:50PM +0200, Tejun Heo wrote:
> Hello,
>
> On 07/26/2010 06:31 PM, Michael S. Tsirkin wrote:
> >> On 07/26/2010 06:05 PM, Tejun Heo wrote:
> >>> * Placing try_to_freeze() could be a bit annoying. It shouldn't be
> >>> executed when there's a work to flush.
> >
> > BTW why is this important?
> > We could always get another work and flush right after
> > try_to_freeze, and then flush would block for a long time.
> >
> > BTW the vhost patch you sent does not do this at all.
> > I am guessing it is because our thread is not freezable?
>
> Yeap, I think so.
>
> >> * Similar issue exists for kthread_stop(). The kthread shouldn't exit
> >> while there's a work to flush (please note that kthread_worker
> >> interface allows detaching / attaching worker kthread during
> >> operation, so it should remain in consistent state with regard to
> >> flushing).
> >
> > Not sure I agree here. Users must synchronise flush and stop calls.
> > Otherwise a work might get queued after stop is called, and
> > you won't be able to flush it.
>
> For freeze, it probably is okay but for stop, I think it's better to
> keep the semantics straight forward.

What are the semantics then? What do we want stop followed
by queue and flush to do?

> It may be okay to do otherwise
> but having such oddity in generic interface is nasty and may lead to
> surprises which can be pretty difficult to track down later on. It's
> just a bit more of annoyance while writing the generic code, so...
>
> Thanks.
>
> --
> tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

First | Prev | Next | Last
Pages: 1 2 3
Prev: firewire: new driver: nosy - IEEE 1394 traffic sniffer
Next: Small perf optimization