From: Jamie Lokier on
Changli Gao wrote:
> On Wed, Apr 28, 2010 at 5:29 PM, David Howells <dhowells(a)redhat.com> wrote:
> > Changli Gao <xiaosuo(a)gmail.com> wrote:
> >
> >> If there isn't enough work to be done, we'd better not disrupt them
> >> and leave them sleeping forever to keep the scheduler happier. Do we
> >> have reason to keep fair to all the workers? Does it have benefit?
> >
> > You've made one important assumption: the processes on the wait queue are
> > sleeping waiting to service things... but what if the wait queue governs
> > access to a resource, and all the processes on that wait queue need access to
> > that resource to do things? Some of the processes waiting for it may never
> > get a go, and so necessary work may be left undone.
> >
>
> You are right. I made the wrong assumption. But we indeed need some
> primitive to add wait_queue at the head of the wait_queue_head, and I
> know epoll needs it, at least.
>
> fs/eventpoll.c: 1443.
> wait.flags |= WQ_FLAG_EXCLUSIVE;
> __add_wait_queue(&ep->wq, &wait);

The same thing about assumptions applies here. The userspace process
may be waiting for an epoll condition to get access to a resource,
rather than being a worker thread interchangeable with others.

For example, userspace might be using a pipe as a signal-safe lock, or
signal-safe multi-token semaphore, and epoll to wait for that pipe.
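
A minimal userspace sketch of the pipe-as-semaphore idea, to make the
example concrete. The names (pipe_sem, pipe_sem_init, and so on) are
hypothetical and not from any library; each byte in the pipe is one token,
and only read() and write() are used on the hot path because POSIX lists
both as async-signal-safe. The read end is made non-blocking so that a
waiter could sleep in epoll on it and then try to take a token.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical token semaphore: one byte in the pipe == one free token. */
struct pipe_sem {
    int rfd; /* read end: acquire takes a byte from here */
    int wfd; /* write end: release puts a byte back here */
};

/* Return one token. Uses only write(), which is async-signal-safe. */
int pipe_sem_release(struct pipe_sem *s)
{
    char token = 0;
    ssize_t n;

    do {
        n = write(s->wfd, &token, 1);
    } while (n == -1 && errno == EINTR);
    return n == 1 ? 0 : -1;
}

/* Take one token without sleeping; -1 with errno EAGAIN if none is free. */
int pipe_sem_try_acquire(struct pipe_sem *s)
{
    char token;
    ssize_t n;

    do {
        n = read(s->rfd, &token, 1);
    } while (n == -1 && errno == EINTR);
    return n == 1 ? 0 : -1;
}

/* Create the pipe and preload it; tokens must fit in the pipe buffer. */
int pipe_sem_init(struct pipe_sem *s, int tokens)
{
    int fds[2];
    int fl;

    assert(tokens >= 0);
    if (pipe(fds) != 0)
        return -1;
    fl = fcntl(fds[0], F_GETFL);
    if (fl == -1 || fcntl(fds[0], F_SETFL, fl | O_NONBLOCK) == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    s->rfd = fds[0];
    s->wfd = fds[1];
    while (tokens-- > 0) {
        if (pipe_sem_release(s) != 0)
            return -1;
    }
    return 0;
}
```

With several tasks sleeping in epoll on rfd, which of them is woken when a
token is released decides who gets the resource, so a wakeup order that
always favours the most recent sleeper could starve the others.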

WQ_FLAG_EXCLUSIVE means there is no point waking all tasks, to avoid a
pointless thundering herd. It doesn't mean unfairness is ok.

The LIFO idea _might_ make sense for interchangeable worker-thread
situations - including userspace. It would make sense for pipe
waiters, socket waiters (especially accept), etc.

Do you have any measurements showing the LIFO mode performing
better than FIFO, and by how much?

-- Jamie
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: David Howells on
Changli Gao <xiaosuo(a)gmail.com> wrote:

> @@ -50,6 +48,7 @@ struct wait_bit_queue {
> struct __wait_queue_head {
> spinlock_t lock;
> struct list_head task_list;
> + struct list_head task_list_ex;

It would be preferable if you could avoid making struct __wait_queue_head
bigger. That will increase the size of a lot of things.

David
From: Changli Gao on
On Wed, Apr 28, 2010 at 5:34 PM, David Howells <dhowells(a)redhat.com> wrote:
>
> Can you split the wait queue code and differentiate exclusive wait queues with
> LIFO functionality from wait queues with FIFO functionality?  I can see why
> your suggestion is desirable.
>

OK. I will use two lists: one for non-exclusive wait queues, and the
other for exclusive wait queues, and I'll add new interfaces for LIFO
functionality instead of changing the current interfaces.
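
A rough userspace model of the two-list layout, as a sketch only. The
field names task_list and task_list_ex follow the patch hunk quoted in
this thread; everything else (struct node, the wq2_* helpers) is
hypothetical and is not kernel code. It assumes a wake pass visits every
non-exclusive waiter and then at most nr_exclusive exclusive waiters from
the front of the exclusive list, and it does not model the kernel
convention that nr_exclusive == 0 means "wake all".

```c
#include <assert.h>
#include <stddef.h>

/* Circular doubly linked list node, in the style of the kernel's list_head. */
struct node {
    struct node *prev, *next;
    int id;
};

struct wq_head2 {
    struct node task_list;    /* non-exclusive waiters */
    struct node task_list_ex; /* exclusive waiters */
};

static void list_init(struct node *h)
{
    h->prev = h->next = h;
    h->id = -1;
}

/* Link n directly after pos. */
static void list_insert_after(struct node *pos, struct node *n)
{
    assert(pos != NULL && n != NULL);
    n->next = pos->next;
    n->prev = pos;
    pos->next->prev = n;
    pos->next = n;
}

static void wq2_init(struct wq_head2 *q)
{
    list_init(&q->task_list);
    list_init(&q->task_list_ex);
}

/* Non-exclusive waiter: head of the non-exclusive list. */
static void wq2_add(struct wq_head2 *q, struct node *n)
{
    list_insert_after(&q->task_list, n);
}

/* Existing behaviour: exclusive waiter queued at the tail (FIFO). */
static void wq2_add_exclusive_fifo(struct wq_head2 *q, struct node *n)
{
    list_insert_after(q->task_list_ex.prev, n);
}

/* New interface: exclusive waiter queued at the head (LIFO), still O(1). */
static void wq2_add_exclusive_lifo(struct wq_head2 *q, struct node *n)
{
    list_insert_after(&q->task_list_ex, n);
}

/* Record the ids a wake pass would visit, in order; returns how many. */
static int wq2_wake(const struct wq_head2 *q, int nr_exclusive,
                    int *out, int cap)
{
    const struct node *p;
    int n = 0;

    assert(out != NULL || cap == 0);
    for (p = q->task_list.next; p != &q->task_list && n < cap; p = p->next)
        out[n++] = p->id;
    for (p = q->task_list_ex.next;
         p != &q->task_list_ex && nr_exclusive > 0 && n < cap;
         p = p->next) {
        out[n++] = p->id;
        nr_exclusive--;
    }
    return n;
}
```

Keeping the FIFO and LIFO entry points separate means existing callers
see no change in wakeup order unless they opt in to the new one.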

--
Regards,
Changli Gao(xiaosuo(a)gmail.com)
From: Changli Gao on
On Wed, Apr 28, 2010 at 9:21 PM, Jamie Lokier <jamie(a)shareable.org> wrote:
> Changli Gao wrote:
>>
>> fs/eventpoll.c: 1443.
>>                 wait.flags |= WQ_FLAG_EXCLUSIVE;
>>                 __add_wait_queue(&ep->wq, &wait);
>
> The same thing about assumptions applies here.  The userspace process
> may be waiting for an epoll condition to get access to a resource,
> rather than being a worker thread interchangeable with others.

Oh, the lines above are the current code, so that assumption already
applies and works here.

>
> For example, userspace might be using a pipe as a signal-safe lock, or
> signal-safe multi-token semaphore, and epoll to wait for that pipe.
>
> WQ_FLAG_EXCLUSIVE means there is no point waking all tasks, to avoid a
> pointless thundering herd.  It doesn't mean unfairness is ok.

Users should not make any assumption about the wake-up order,
whether LIFO or FIFO.

>
> The LIFO idea _might_ make sense for interchangeable worker-thread
> situations - including userspace.  It would make sense for pipe
> waiters, socket waiters (especially accept), etc.

Yes, and my follow-up patches are for socket waiters.

>
> Do you have any measurements showing the LIFO mode performing
> better than FIFO, and by how much?
>

I haven't run any tests yet, but work done by the LSE project years ago
showed that it is better.

http://lse.sourceforge.net/io/aionotes.txt

" Also in view of
better cache utilization the wake queue mechanism is LIFO by default.
(A new exclusive LIFO wakeup option has been introduced for this purpose)"

--
Regards,
Changli Gao(xiaosuo(a)gmail.com)
From: Changli Gao on
On Wed, Apr 28, 2010 at 5:32 PM, David Howells <dhowells(a)redhat.com> wrote:
> Changli Gao <xiaosuo(a)gmail.com> wrote:
>
>> @@ -50,6 +48,7 @@ struct wait_bit_queue {
>>  struct __wait_queue_head {
>>       spinlock_t lock;
>>       struct list_head task_list;
>> +     struct list_head task_list_ex;
>
> It would be preferable if you could avoid making struct __wait_queue_head
> bigger.  That will increase the size of a lot of things.
>

I don't know how to do that, since there may be both non-exclusive and
exclusive wait queues on the same wait queue head. If we want to
enqueue an exclusive wait queue at the head of the exclusive ones, we
have to know where that head is; otherwise we have to loop to find it
when enqueuing.
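
To illustrate the cost being described, here is a small userspace model
of the single-list case. It is a sketch under stated assumptions, not
kernel code: it assumes non-exclusive waiters are kept at the front of
the list and exclusive ones behind them, and all names other than
WQ_FLAG_EXCLUSIVE (struct waiter, the add_* helpers) are hypothetical.
Without a second list head, a LIFO insert of an exclusive waiter has to
walk past every non-exclusive entry first.

```c
#include <assert.h>
#include <stddef.h>

#define WQ_FLAG_EXCLUSIVE 0x01 /* value chosen for this model only */

struct waiter {
    int id;
    unsigned int flags;
    struct waiter *prev, *next;
};

/* One circular list with a sentinel, standing in for a single task_list. */
struct wq_head {
    struct waiter sentinel;
};

static void wq_init(struct wq_head *q)
{
    q->sentinel.id = -1;
    q->sentinel.flags = 0;
    q->sentinel.prev = q->sentinel.next = &q->sentinel;
}

/* Link w directly after pos. */
static void insert_after(struct waiter *pos, struct waiter *w)
{
    assert(pos != NULL && w != NULL);
    w->next = pos->next;
    w->prev = pos;
    pos->next->prev = w;
    pos->next = w;
}

/* Non-exclusive waiters go to the very front: O(1). */
static void add_nonexclusive(struct wq_head *q, struct waiter *w)
{
    w->flags &= ~WQ_FLAG_EXCLUSIVE;
    insert_after(&q->sentinel, w);
}

/* FIFO exclusive: append at the tail: O(1). */
static void add_exclusive_fifo(struct wq_head *q, struct waiter *w)
{
    w->flags |= WQ_FLAG_EXCLUSIVE;
    insert_after(q->sentinel.prev, w);
}

/*
 * LIFO exclusive on one list: the head of the exclusive run is not
 * stored anywhere, so scan past the non-exclusive entries to find it.
 * Cost is linear in the number of non-exclusive waiters.
 */
static void add_exclusive_lifo(struct wq_head *q, struct waiter *w)
{
    struct waiter *pos = &q->sentinel;

    w->flags |= WQ_FLAG_EXCLUSIVE;
    while (pos->next != &q->sentinel &&
           !(pos->next->flags & WQ_FLAG_EXCLUSIVE))
        pos = pos->next;
    insert_after(pos, w);
}

/* Id of the exclusive waiter a wake-one pass would reach first, or -1. */
static int first_exclusive_id(const struct wq_head *q)
{
    const struct waiter *w;

    for (w = q->sentinel.next; w != &q->sentinel; w = w->next)
        if (w->flags & WQ_FLAG_EXCLUSIVE)
            return w->id;
    return -1;
}
```

The scan in add_exclusive_lifo is the loop that a second list head, or
some other stored pointer to the first exclusive entry, would avoid, at
the price of a larger wait queue head.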

--
Regards,
Changli Gao(xiaosuo(a)gmail.com)