From: Vivek Goyal on
On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
> RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
> without further checks.
> RQ_NOIDLE can be used to mark the last request of a sequence for which
> - we want to idle between the requests of the sequence, to keep locality
> - we don't want to idle after the sequence, because we know that no new
> nearby requests will follow, so we should switch servicing other
> queues.

Corrado, in higher layers any WRITE_SYNC request currently is marked
as RQ_NOIDLE. At that point it is just not known whether there will be
another request after this or not. So I would not think of RQ_NOIDLE
as conclusively telling us that this is the last request in the
sequence.

I think that with a WRITE_SYNC request, we just don't know whether the
application is going to write more immediately or not. fsync, O_SYNC etc.
fall in this category.

But in general I like the idea of getting rid of idling in as many cases
as possible. Jeff's recent posting to fix the fsync issue depends on idling
even on WRITE_SYNC queues, so your patch and his patchset are
fundamentally incompatible.

Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
don't know the answer to that question. :-)). But in general I want to
get rid of idling as much as possible otherwise it becomes a serious
bottleneck in any kind of performance testing on higher end storage.

At the same time, not idling runs the risk of a process doing WRITE_SYNC
not getting its fair share in the presence of sequential readers if the
writer does not keep the queue busy.

I will do some testing with this patchset a little later.

Thanks
Vivek

> This patch fixes this behaviour, making it similar to how it behaved
> before 8e55063, but still fixing the corner cases that were the
> motivation for it.
>
> Signed-off-by: Corrado Zoccolo <czoccolo(a)gmail.com>
> ---
> block/cfq-iosched.c | 15 ++++++++++-----
> 1 files changed, 10 insertions(+), 5 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index 5ef9a5d..cac3afb 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>                               cfqd->noidle_tree_requires_idle |= bitmask;
>
>                       /*
> -                      * Idling is enabled for SYNC_WORKLOAD.
> -                      * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
> -                      * only if we processed at least one !rq_noidle request
> +                      * Idling is enabled for:
> +                      * - the last sync queue of a group
> +                      * - SYNC_WORKLOAD queues, for !rq_noidle requests
> +                      * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
> +                      *   if at least one queue sent !rq_noidle requests
> +                      *   not followed by at least one rq_noidle request.
>                        */
> -                     if (cfqd->serving_type == SYNC_WORKLOAD
> -                         || cfqd->noidle_tree_requires_idle
> +                     if ((cfqd->serving_type == SYNC_WORKLOAD
> +                          && !rq_noidle(rq))
> +                         || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
> +                             && cfqd->noidle_tree_requires_idle)
>                           || cfqq->cfqg->nr_cfqq == 1)
>                               cfq_arm_slice_timer(cfqd);
>               }
> --
> 1.6.4.4
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Vivek Goyal on
On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
> > RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
> > without further checks.
> > RQ_NOIDLE can be used to mark the last request of a sequence for which
> > - we want to idle between the requests of the sequence, to keep locality
> > - we don't want to idle after the sequence, because we know that no new
> > nearby requests will follow, so we should switch servicing other
> > queues.
>
> Corrado, in higher layers any WRITE_SYNC request currently is marked
> as RQ_NOIDLE. At that point it is just not known whether there will be
> another request after this or not. So I would not think of RQ_NOIDLE
> as conclusively telling us that this is the last request in the
> sequence.
>
> I think that with a WRITE_SYNC request, we just don't know whether the
> application is going to write more immediately or not. fsync, O_SYNC etc.
> fall in this category.
>
> But in general I like the idea of getting rid of idling in as many cases
> as possible. Jeff's recent posting to fix the fsync issue depends on idling
> even on WRITE_SYNC queues, so your patch and his patchset are
> fundamentally incompatible.
>
> Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher end storage.
>
> At the same time, not idling runs the risk of a process doing WRITE_SYNC
> not getting its fair share in the presence of sequential readers if the
> writer does not keep the queue busy.
>
> I will do some testing with this patchset a little later.

Hmm..., noticed that you are still using Jens's old mail id. Fixing it.

Thanks
Vivek
>
> > This patch fixes this behaviour, making it similar to how it behaved
> > before 8e55063, but still fixing the corner cases that were the
> > motivation for it.
> >
> > Signed-off-by: Corrado Zoccolo <czoccolo(a)gmail.com>
> > ---
> > block/cfq-iosched.c | 15 ++++++++++-----
> > 1 files changed, 10 insertions(+), 5 deletions(-)
> >
> > diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> > index 5ef9a5d..cac3afb 100644
> > --- a/block/cfq-iosched.c
> > +++ b/block/cfq-iosched.c
> > @@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
> >                               cfqd->noidle_tree_requires_idle |= bitmask;
> >
> >                       /*
> > -                      * Idling is enabled for SYNC_WORKLOAD.
> > -                      * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
> > -                      * only if we processed at least one !rq_noidle request
> > +                      * Idling is enabled for:
> > +                      * - the last sync queue of a group
> > +                      * - SYNC_WORKLOAD queues, for !rq_noidle requests
> > +                      * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
> > +                      *   if at least one queue sent !rq_noidle requests
> > +                      *   not followed by at least one rq_noidle request.
> >                        */
> > -                     if (cfqd->serving_type == SYNC_WORKLOAD
> > -                         || cfqd->noidle_tree_requires_idle
> > +                     if ((cfqd->serving_type == SYNC_WORKLOAD
> > +                          && !rq_noidle(rq))
> > +                         || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
> > +                             && cfqd->noidle_tree_requires_idle)
> >                           || cfqq->cfqg->nr_cfqq == 1)
> >                               cfq_arm_slice_timer(cfqd);
> >               }
> > --
> > 1.6.4.4
From: Corrado Zoccolo on
On Wed, Jul 7, 2010 at 5:46 PM, Vivek Goyal <vgoyal(a)redhat.com> wrote:
> On Wed, Jul 07, 2010 at 05:23:47PM +0200, Corrado Zoccolo wrote:
>> RQ_NOIDLE flag is meaningful and should be honored for SYNC_WORKLOAD,
>> without further checks.
>> RQ_NOIDLE can be used to mark the last request of a sequence for which
>> - we want to idle between the requests of the sequence, to keep locality
>> - we don't want to idle after the sequence, because we know that no new
>>   nearby requests will follow, so we should switch servicing other
>>   queues.
>
> Corrado, in higher layers any WRITE_SYNC request currently is marked
> as RQ_NOIDLE. At that point it is just not known whether there will be
> another request after this or not. So I would not think of RQ_NOIDLE
> as conclusively telling us that this is the last request in the
> sequence.

WRITE_SYNC requests are probably marked as RQ_NOIDLE because the
application can always send several write requests together (while for
reads, you usually need the result of one read before sending the next).
This means that cfq will actually care only about the RQ_NOIDLE on the
last request in the queue (since, until the queue is empty, we don't even
consider idling).
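
A minimal sketch of the decision under discussion, written as standalone C
rather than kernel code. The struct, enum, and function names are
hypothetical stand-ins that mirror the identifiers in the quoted diff; the
two predicates follow the '-' and '+' lines of that diff, and the `queued`
check is an assumption that models "until the queue is empty, we don't even
consider idling".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the CFQ workload types named in the patch. */
enum wl_type { ASYNC_WORKLOAD, SYNC_NOIDLE_WORKLOAD, SYNC_WORKLOAD };

/* Hypothetical snapshot of the state the condition inspects. */
struct idle_ctx {
    enum wl_type serving_type;      /* models cfqd->serving_type */
    bool noidle_tree_requires_idle; /* models cfqd->noidle_tree_requires_idle */
    int nr_cfqq;                    /* models cfqq->cfqg->nr_cfqq */
    bool rq_noidle;                 /* models rq_noidle(rq) on the completed request */
    int queued;                     /* requests still pending in this queue */
};

/* Condition as it reads on the '-' lines of the quoted diff. */
bool arm_idle_before(const struct idle_ctx *c)
{
    if (c->queued > 0)  /* assumption: idling is only considered on an empty queue */
        return false;
    return c->serving_type == SYNC_WORKLOAD
        || c->noidle_tree_requires_idle
        || c->nr_cfqq == 1;
}

/* Condition as it reads on the '+' lines of the quoted diff. */
bool arm_idle_after(const struct idle_ctx *c)
{
    if (c->queued > 0)  /* same assumption as above */
        return false;
    return (c->serving_type == SYNC_WORKLOAD && !c->rq_noidle)
        || (c->serving_type == SYNC_NOIDLE_WORKLOAD
            && c->noidle_tree_requires_idle)
        || c->nr_cfqq == 1;
}

/* Self-check: a SYNC_WORKLOAD queue whose last request carries RQ_NOIDLE. */
void idle_sketch_selfcheck(void)
{
    struct idle_ctx c = { SYNC_WORKLOAD, false, 2, true, 0 };

    assert(arm_idle_before(&c));   /* old condition idles regardless of the flag */
    assert(!arm_idle_after(&c));   /* new condition honors RQ_NOIDLE */

    c.queued = 3;                  /* flag on a non-final request is irrelevant */
    assert(!arm_idle_before(&c) && !arm_idle_after(&c));
}
```

In this model the flag only matters on the request that empties the queue,
which is why marking every WRITE_SYNC request RQ_NOIDLE and marking only
the last one behave the same from the scheduler's side.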

>
> I think that with a WRITE_SYNC request, we just don't know whether the
> application is going to write more immediately or not. fsync, O_SYNC etc.
> fall in this category.
>
> But in general I like the idea of getting rid of idling in as many cases
> as possible. Jeff's recent posting to fix the fsync issue depends on idling
> even on WRITE_SYNC queues, so your patch and his patchset are
> fundamentally incompatible.

I think this can be easily fixed by removing RQ_NOIDLE from those
requests on which Jeff wants to idle. Once no more requests can ever
be marked RQ_NOIDLE, then we can remove this code completely.

>
> Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher end storage.
>
> At the same time, not idling runs the risk of a process doing WRITE_SYNC
> not getting its fair share in the presence of sequential readers if the
> writer does not keep the queue busy.
>
> I will do some testing with this patchset a little later.
Thanks, I've resent the patches for 2.6.36 (this version was based on 2.6.34).
Corrado
>
> Thanks
> Vivek
>
>> This patch fixes this behaviour, making it similar to how it behaved
>> before 8e55063, but still fixing the corner cases that were the
>> motivation for it.
>>
>> Signed-off-by: Corrado Zoccolo <czoccolo(a)gmail.com>
>> ---
>>  block/cfq-iosched.c |   15 ++++++++++-----
>>  1 files changed, 10 insertions(+), 5 deletions(-)
>>
>> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
>> index 5ef9a5d..cac3afb 100644
>> --- a/block/cfq-iosched.c
>> +++ b/block/cfq-iosched.c
>> @@ -3356,12 +3356,17 @@ static void cfq_completed_request(struct request_queue *q, struct request *rq)
>>                               cfqd->noidle_tree_requires_idle |= bitmask;
>>
>>                       /*
>> -                      * Idling is enabled for SYNC_WORKLOAD.
>> -                      * SYNC_NOIDLE_WORKLOAD idles at the end of the tree
>> -                      * only if we processed at least one !rq_noidle request
>> +                      * Idling is enabled for:
>> +                      * - the last sync queue of a group
>> +                      * - SYNC_WORKLOAD queues, for !rq_noidle requests
>> +                      * - SYNC_NOIDLE_WORKLOAD "at the end of the tree"
>> +                      *   if at least one queue sent !rq_noidle requests
>> +                      *   not followed by at least one rq_noidle request.
>>                        */
>> -                     if (cfqd->serving_type == SYNC_WORKLOAD
>> -                         || cfqd->noidle_tree_requires_idle
>> +                     if ((cfqd->serving_type == SYNC_WORKLOAD
>> +                          && !rq_noidle(rq))
>> +                         || (cfqd->serving_type == SYNC_NOIDLE_WORKLOAD
>> +                             && cfqd->noidle_tree_requires_idle)
>>                           || cfqq->cfqg->nr_cfqq == 1)
>>                               cfq_arm_slice_timer(cfqd);
>>               }
>> --
>> 1.6.4.4
>



--
__________________________________________________________________________

dott. Corrado Zoccolo mailto:czoccolo(a)gmail.com
PhD - Department of Computer Science - University of Pisa, Italy
--------------------------------------------------------------------------
The self-confidence of a warrior is not the self-confidence of the average
man. The average man seeks certainty in the eyes of the onlooker and calls
that self-confidence. The warrior seeks impeccability in his own eyes and
calls that humbleness.
Tales of Power - C. Castaneda
From: Christoph Hellwig on
On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> don't know the answer to that question. :-)). But in general I want to
> get rid of idling as much as possible otherwise it becomes a serious
> bottleneck in any kind of performance testing on higher end storage.

After thinking about this for a while, I think the major problem is
that we use WRITE_SYNC for two very different I/O patterns.

One is synchronous data I/O (O_SYNC/O_DIRECT/fsync). While this is a
high-level synchronous workload in the sense that someone waits for the
I/O to finish, the I/O can still be batched as we're submitting a
relatively large number of bios.

The other one is synchronous writeout of metadata or the journal. Here
we typically wait on that single I/O we're just submitting (or at most a
handful), and there is absolutely no point in idling.

We already have the REQ_NOIDLE flag to distinguish between the two, so
instead of second guessing we should actually make use of it.
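
A minimal sketch of that split, assuming the submitter sets the no-idle
hint only for the metadata/journal pattern. The `SK_` bit values, the enum,
and both helpers are hypothetical and exist only for illustration; only the
idea of a sync flag plus a no-idle flag comes from the discussion above.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical bit values; they do not match any kernel header. */
#define SK_REQ_WRITE  (1u << 0)
#define SK_REQ_SYNC   (1u << 1)
#define SK_REQ_NOIDLE (1u << 2)

/* The two synchronous write patterns described above. */
enum sync_write_kind {
    SYNC_DATA_WRITE,     /* O_SYNC / O_DIRECT / fsync data: more bios likely follow */
    SYNC_METADATA_WRITE  /* metadata or journal: a single I/O, or a handful */
};

/* Submitter side: choose flags according to the pattern. */
unsigned int sync_write_flags(enum sync_write_kind kind)
{
    unsigned int flags = SK_REQ_WRITE | SK_REQ_SYNC;

    if (kind == SYNC_METADATA_WRITE)
        flags |= SK_REQ_NOIDLE;  /* nothing nearby will follow; do not idle */
    return flags;
}

/* Scheduler side: trust the hint instead of second-guessing it. */
bool scheduler_may_idle(unsigned int flags)
{
    return (flags & SK_REQ_SYNC) && !(flags & SK_REQ_NOIDLE);
}

/* Self-check for the two patterns. */
void sync_flags_selfcheck(void)
{
    assert(scheduler_may_idle(sync_write_flags(SYNC_DATA_WRITE)));
    assert(!scheduler_may_idle(sync_write_flags(SYNC_METADATA_WRITE)));
}
```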
From: Vivek Goyal on
On Wed, Jul 07, 2010 at 12:13:08PM -0400, Christoph Hellwig wrote:
> On Wed, Jul 07, 2010 at 11:46:31AM -0400, Vivek Goyal wrote:
> > Whether to idle on WRITE_SYNC or not, I will leave it to Jens (I just
> > don't know the answer to that question. :-)). But in general I want to
> > get rid of idling as much as possible otherwise it becomes a serious
> > bottleneck in any kind of performance testing on higher end storage.
>
> After thinking about this for a while, I think the major problem is
> that we use WRITE_SYNC for two very different I/O patterns.
>
> One is synchronous data I/O (O_SYNC/O_DIRECT/fsync). While this is a
> high-level synchronous workload in the sense that someone waits for the
> I/O to finish, the I/O can still be batched as we're doing relatively
> large amounts of bios.
>
> The other one is synchronous writeout of metadata or the journal.

Jeff Moyer had mentioned that in his testing, journal writes from jbd
threads were appearing as asynchronous (WRITES) in CFQ, and we don't do any
kind of idling in CFQ on asynchronous WRITES. So this is probably already
a non-issue.

Thanks
Vivek

> Here
> we typically wait on that single I/O we're just submitting (or at most a
> handful), and there is absolutely no point in idling.
>
> We already have the REQ_NOIDLE flag to distinguish between the two, so
> instead of second guessing we should actually make use of it.