From: tytso on
On Tue, Jun 15, 2010 at 12:43:17AM +0200, Tejun Heo wrote:
>
> Well, basics of the whole thing didn't change all that much since the
> first take and most people on cc list were cc'd on each take. The
> biggest reason I'm still carrying the whole patchset is due to the
> scheduler changes. The numbers are in the third take (which you can
> follow the links to find out). Anyways, I'll write up another summary
> tomorrow.

It really helps if patch summaries are self-contained and don't
require a bunch of kernel developers who are trying to review things
to do research and figure out which links are the right ones to chase
down. It's also not reasonable to expect your reviewers to diff your
patches to determine how much has changed and whether benchmarks run
months ago should still be considered applicable.

Many of us get literally hundreds of e-mail messages a day, and
e-mails are read with one finger hovering over the 'd' key. It
simply scales better if you don't assume that everybody else considers
the patch as important as you do, and instead assume that most people
have forgotten patches sent months ago....

Regards,

- Ted
From: Stefan Richter on
Tejun Heo wrote:
> This is the fifth take of cmwq (concurrency managed workqueue)
> patchset. It's on top of v2.6.35-rc3 + sched/core patches. Git tree
> is available at
>
> git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-cmwq

A comment and a question:

As a driver maintainer, I would find it helpful if the WQ_flags in
include/linux/workqueue.h and/or __create_workqueue_key() in
kernel/workqueue.c (or its wrappers in include/linux/workqueue.h) were
better documented.
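
For illustration, roughly the level of commenting I have in mind; the
two flag names are the ones I ran into, while the bit values and the
descriptions are only my guesses and may well be wrong:

  /* sketch only, not the actual include/linux/workqueue.h */
  enum {
	WQ_SINGLE_CPU		= 1 << 0, /* execute works on one CPU
					     at a time? */
	WQ_NON_REENTRANT	= 1 << 1, /* never run the same work
					     concurrently, even on
					     another CPU? */
  };

Exactly which guarantee each flag gives, and which flag a driver needs
in which situation, is what I would like to see spelled out.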

How about the global workqueue, i.e. schedule_work() and friends? At
your current review-cmwq head, they use system_wq, not system_nrt_wq.
But doesn't the present global workqueue have WQ_NON_REENTRANT
semantics? In fact, don't _all_ workqueues presently have
WQ_NON_REENTRANT semantics? If so, a good deal of existing users
probably rely on non-reentrant behaviour. Or am I thoroughly
misunderstanding the meaning of WQ_NON_REENTRANT?

(Sorry if this has been discussed before; I followed the discussions of
some of your previous submissions but not all. And PS, I am eagerly
waiting for this to go into the mainline.)
--
Stefan Richter
-=====-==-=- -==- -====
http://arcgraph.de/sr/
From: Tejun Heo on
Hello,

On 06/15/2010 08:29 PM, Stefan Richter wrote:
> Tejun Heo wrote:
>> This is the fifth take of cmwq (concurrency managed workqueue)
>> patchset. It's on top of v2.6.35-rc3 + sched/core patches. Git tree
>> is available at
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq.git review-cmwq
>
> A comment and a question:
>
> As a driver maintainer, I would find it helpful if the WQ_flags in
> include/linux/workqueue.h and/or __create_workqueue_key() in
> kernel/workqueue.c (or its wrappers in include/linux/workqueue.h) were
> better documented.

Sure, it can definitely be improved.

> How about the global workqueue, i.e. schedule_work() and friends? At
> your current review-cmwq head, they use system_wq, not system_nrt_wq.
> But doesn't the present global workqueue have WQ_NON_REENTRANT
> semantics? In fact, don't _all_ workqueues presently have
> WQ_NON_REENTRANT semantics? If so, a good deal of existing users
> probably rely on non-reentrant behaviour. Or am I thoroughly
> misunderstanding the meaning of WQ_NON_REENTRANT?

Yeah, it's a bit confusing. :-( The current workqueue semantics are
non-reentrant on the same cpu but reentrant across different cpus.
WQ_NON_REENTRANT is non-reentrant regardless of cpu, so it's a
stronger guarantee than before. To summarize,

  current MT == !WQ_NON_REENTRANT < WQ_NON_REENTRANT <
  WQ_SINGLE_CPU < current ST == WQ_SINGLE_CPU + max in_flight of 1.
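
As a concrete (made-up) example of what the stronger guarantee means:

  /* hypothetical driver, for illustration only */
  struct frob_dev {
	struct work_struct work;
	unsigned int seq;		/* updated without locking */
  };

  static void frob_work_fn(struct work_struct *work)
  {
	struct frob_dev *dev = container_of(work, struct frob_dev, work);

	/*
	 * Assumes only one instance runs at a time.  The current MT
	 * wq guarantees that per-cpu only: requeue the work from
	 * another cpu while it's still running here and a second
	 * instance can run there concurrently.  WQ_NON_REENTRANT
	 * closes that hole across all cpus.
	 */
	dev->seq++;
  }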

> (Sorry if this has been discussed before; I followed the discussions of
> some of your previous submissions but not all. And PS, I am eagerly
> waiting for this to go into the mainline.)

Ah, yeah, after ten months, I'm pretty eager too. :-)

Thanks.

--
tejun
From: Stefan Richter on
Andrew Morton wrote:
> On Mon, 14 Jun 2010 23:37:17 +0200
> Tejun Heo <tj@kernel.org> wrote:
>
>> This is the fifth take of cmwq (concurrency managed workqueue)
>> patchset.
>
> What is a concurrency managed workqueue and why do we want one?

From what I understood, this is about the following:

- Right now, a workqueue is backed by either 1 or by #_of_CPUs
  kernel threads. There is no other option.

- To avoid creating half a million kernel threads, driver authors
  resort to either
  - using the globally shared workqueue even if they might queue
    high-latency work in corner cases,
  or
  - creating a single-threaded workqueue even if they put unrelated
    jobs into that queue which would be better executed in
    parallel, not serially.
  (I for one have both cases in drivers/firewire/, and I have similar
  issues in the old drivers/ieee1394/.)
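
To make the two resorts concrete with the current API (the work item
names are made up):

  /* resort 1: global shared workqueue */
  schedule_work(&card->bus_reset_work);	/* may sit behind slow,
					   unrelated works */

  /* resort 2: private single-threaded workqueue */
  struct workqueue_struct *fw_wq;

  fw_wq = create_singlethread_workqueue("firewire");
  queue_work(fw_wq, &card0->bm_work);
  queue_work(fw_wq, &card1->bm_work);	/* serialized after card0's
					   work although unrelated */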

The cmwq patch series reforms workqueues to be backed by a global thread
pool. Hence:

+ Driver authors can and should simply register one queue for any one
  purpose now. They no longer need to worry about having too many or
  too few backing threads.

+ [A side effect: In some cases, a driver that currently uses a
thread pool can be simplified by migrating to the workqueue API.]

Tejun, please correct me if I misunderstood.
--
Stefan Richter
-=====-==-=- -==- -====
http://arcgraph.de/sr/
From: Tejun Heo on
Hello,

On 06/15/2010 08:15 PM, Stefan Richter wrote:
> From what I understood, this is about the following:
>
> - Right now, a workqueue is backed by either 1 or by #_of_CPUs
>   kernel threads. There is no other option.
>
> - To avoid creating half a million kernel threads, driver authors
>   resort to either
>   - using the globally shared workqueue even if they might queue
>     high-latency work in corner cases,
>   or
>   - creating a single-threaded workqueue even if they put unrelated
>     jobs into that queue which would be better executed in
>     parallel, not serially.
>   (I for one have both cases in drivers/firewire/, and I have similar
>   issues in the old drivers/ieee1394/.)
>
> The cmwq patch series reforms workqueues to be backed by a global thread
> pool. Hence:
>
> + Driver authors can and should simply register one queue for any one
>   purpose now. They no longer need to worry about having too many or
>   too few backing threads.

A wq now serves more as a flushing and max-in-flight controlling
domain, so unless a driver needs to flush the whole workqueue (as
opposed to each work), throttle max-in-flight, or use the queue on the
allocation path (in which case an emergency worker should also be
used), the default wq should work fine too.
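
IOW, something like this (driver names made up):

  /* needs a wq-wide flush => create your own wq */
  flush_workqueue(mydrv_wq);	/* waits for everything on mydrv_wq */

  /* only needs per-work flushing => the default wq is fine */
  schedule_work(&dev->reset_work);
  cancel_work_sync(&dev->reset_work);	/* flushes/cancels this work only */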

> + [A side effect: In some cases, a driver that currently uses a
> thread pool can be simplified by migrating to the workqueue API.]
>
> Tejun, please correct me if I misunderstood.

Yeap, from a driver's POV, mostly precise. The reason I started this
whole thing is that I was trying to implement in-kernel media presence
polling (mostly for cdroms, but it may also be useful for polling
other stuff on other types of devices) and immediately got stuck on
how to manage concurrency.

I can create a single kthread per drive, which should work fine in
most cases, but there are configurations with a lot of devices and
that's not only wasteful but might actually cause scalability issues.
For the most common cases, an ST or MT wq could be enough, but then
again, when something gets stuck (unfortunately somewhat common with
cheap drives), the whole thing gets stuck. So, I was thinking about
creating a worker pool for it and managing concurrency myself, which
felt very silly. I just needed some context to host those polls on
demand, and that's not something I should be worrying about when I'm
trying to implement media presence polling.
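
Just to illustrate how simple the polling itself becomes once the
context problem is taken care of (names hypothetical):

  struct drive {
	struct delayed_work media_poll;
	/* ... */
  };

  static void media_poll_fn(struct work_struct *work)
  {
	struct drive *d = container_of(to_delayed_work(work),
				       struct drive, media_poll);

	check_media(d);		/* may block for long on a flaky drive */
	queue_delayed_work(system_nrt_wq, &d->media_poll, 2 * HZ);
  }

If check_media() hangs on one drive, the shared pool simply spawns
another worker and the other drives keep getting polled.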

I think there are many similar situations for drivers. I already
wrote about libata: it's just silly to worry about how to manage
execution contexts for polling PIO, EH, and hotplug from individual
drivers, and drivers often have to make suboptimal choices because the
problem isn't worth solving fully at that layer. So, cmwq tries to
provide an easy way to get hold of execution contexts on demand.

Thanks.

--
tejun