From: Gui Jianfeng
Vivek Goyal wrote:
> o Implement another CFQ mode where we charge a group in terms of the number
> of requests dispatched instead of measuring time. Measuring in terms
> of time is not possible when we are driving deeper queue depths and there
> are requests from multiple cfq queues in the request queue.
>
> o This mode currently gets activated if one sets slice_idle=0 and the associated
> disk supports NCQ. Again, the idea is that on an NCQ disk with idling disabled,
> most of the queues will dispatch one or more requests, then cfq queue
> expiry happens and we don't have a way to measure time. So start providing
> fairness in terms of IOPS.
>
> o Currently IOPS mode works only with cfq group scheduling. CFQ follows
> different scheduling algorithms for queue and group scheduling. These IOPS
> stats are used only for group scheduling, hence in non-cgroup mode nothing
> should change.
>
> o For CFQ group scheduling one can disable slice idling so that we don't idle
> on queues and drive deeper request queue depths (achieving better throughput);
> at the same time group idle remains enabled, so one should still get service
> differentiation among groups.
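
For reference, a minimal user-space sketch of how one would flip CFQ into this mode from a program. The /sys/block/<dev>/queue/iosched/slice_idle path is the usual CFQ tunable location; the group_idle knob name and the 8 ms value are assumptions taken from the description above, not something this patch itself adds:

/*
 * Sketch: switch CFQ on /dev/sda (example device) into the IOPS-charging
 * mode described above by clearing slice_idle while leaving group idling
 * on.  The group_idle knob name and value are assumptions.
 */
#include <stdio.h>

static int write_tunable(const char *path, const char *val)
{
        FILE *f = fopen(path, "w");

        if (!f)
                return -1;
        fprintf(f, "%s\n", val);
        return fclose(f);
}

int main(void)
{
        /* no per-queue idling: on an NCQ disk this makes iops_mode() true */
        write_tunable("/sys/block/sda/queue/iosched/slice_idle", "0");
        /* keep idling at the group level for service differentiation */
        write_tunable("/sys/block/sda/queue/iosched/group_idle", "8");
        return 0;
}
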
>
> Signed-off-by: Vivek Goyal <vgoyal(a)redhat.com>
> ---
> block/cfq-iosched.c | 30 ++++++++++++++++++++++++------
> 1 files changed, 24 insertions(+), 6 deletions(-)
>
> diff --git a/block/cfq-iosched.c b/block/cfq-iosched.c
> index c5ec2eb..9f82ec6 100644
> --- a/block/cfq-iosched.c
> +++ b/block/cfq-iosched.c
> @@ -378,6 +378,21 @@ CFQ_CFQQ_FNS(wait_busy);
> &cfqg->service_trees[i][j]: NULL) \
>
>
> +static inline bool iops_mode(struct cfq_data *cfqd)
> +{
> + /*
> + * If we are not idling on queues and it is a NCQ drive, parallel
> + * execution of requests is on and measuring time is not possible
> + * in most of the cases until and unless we drive shallower queue
> + * depths and that becomes a performance bottleneck. In such cases
> + * switch to start providing fairness in terms of number of IOs.
> + */
> + if (!cfqd->cfq_slice_idle && cfqd->hw_tag)
> + return true;
> + else
> + return false;
> +}
> +
> static inline enum wl_prio_t cfqq_prio(struct cfq_queue *cfqq)
> {
> if (cfq_class_idle(cfqq))
> @@ -905,7 +920,6 @@ static inline unsigned int cfq_cfqq_slice_usage(struct cfq_queue *cfqq)
> slice_used = cfqq->allocated_slice;
> }
>
> - cfq_log_cfqq(cfqq->cfqd, cfqq, "sl_used=%u", slice_used);
> return slice_used;
> }
>
> @@ -913,19 +927,21 @@ static void cfq_group_served(struct cfq_data *cfqd, struct cfq_group *cfqg,
> struct cfq_queue *cfqq)
> {
> struct cfq_rb_root *st = &cfqd->grp_service_tree;
> - unsigned int used_sl, charge_sl;
> + unsigned int used_sl, charge;
> int nr_sync = cfqg->nr_cfqq - cfqg_busy_async_queues(cfqd, cfqg)
> - cfqg->service_tree_idle.count;
>
> BUG_ON(nr_sync < 0);
> - used_sl = charge_sl = cfq_cfqq_slice_usage(cfqq);
> + used_sl = charge = cfq_cfqq_slice_usage(cfqq);
>
> - if (!cfq_cfqq_sync(cfqq) && !nr_sync)
> - charge_sl = cfqq->allocated_slice;
> + if (iops_mode(cfqd))
> + charge = cfqq->slice_dispatch;

Hi Vivek,

At this point, requests may still be sitting in the dispatch list. Shall we add a new
variable in cfqq to keep track of the number of requests that actually go into the
driver, and charge that number instead?

Thanks
Gui

> + else if (!cfq_cfqq_sync(cfqq) && !nr_sync)
> + charge = cfqq->allocated_slice;
>
> /* Can't update vdisktime while group is on service tree */
> cfq_rb_erase(&cfqg->rb_node, st);
> - cfqg->vdisktime += cfq_scale_slice(charge_sl, cfqg);
> + cfqg->vdisktime += cfq_scale_slice(charge, cfqg);
> __cfq_group_service_tree_add(st, cfqg);
>
> /* This group is being expired. Save the context */
> @@ -939,6 +955,8 @@ static void cfq_group_served(struct cfq_data *cfqd, struct cfq_group *cfqg,
>
> cfq_log_cfqg(cfqd, cfqg, "served: vt=%llu min_vt=%llu", cfqg->vdisktime,
> st->min_vdisktime);
> + cfq_log_cfqq(cfqq->cfqd, cfqq, "sl_used=%u disp=%u charge=%u iops=%u",
> + used_sl, cfqq->slice_dispatch, charge, iops_mode(cfqd));
> cfq_blkiocg_update_timeslice_used(&cfqg->blkg, used_sl);
> cfq_blkiocg_set_start_empty_time(&cfqg->blkg);
> }
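
As a side note, here is a minimal user-space sketch of the charging arithmetic this hunk ends up doing. The default-weight scaling (charge * default_weight / group_weight) is my reading of the existing cfq_scale_slice() code, not something shown in this hunk, and struct group below is a simplified stand-in for the kernel's cfq_group:

/*
 * Sketch of how a group's vdisktime advances per service round.
 * In iops_mode() the charge is the number of requests dispatched
 * (cfqq->slice_dispatch); otherwise it is the time slice used.
 * Assumes cfq_scale_slice() scales the charge by default_weight/weight,
 * so a higher-weight group accumulates vdisktime more slowly and is
 * therefore picked more often.
 */
#include <stdio.h>

#define WEIGHT_DEFAULT 500        /* assumed default blkio weight */

struct group {
        unsigned int weight;
        unsigned long long vdisktime;
};

static void group_served(struct group *g, unsigned int charge)
{
        g->vdisktime += (unsigned long long)charge * WEIGHT_DEFAULT / g->weight;
}

int main(void)
{
        struct group a = { .weight = 100 }, b = { .weight = 200 };
        int i;

        /* IOPS mode: charge one unit per dispatched request */
        for (i = 0; i < 8; i++) {
                group_served(&a, 1);
                group_served(&b, 1);
        }
        /* b's vdisktime grows half as fast, so it gets roughly twice the IOPS */
        printf("a: %llu  b: %llu\n", a.vdisktime, b.vdisktime);
        return 0;
}
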
From: Vivek Goyal
On Tue, Jul 27, 2010 at 01:47:39PM +0800, Gui Jianfeng wrote:

[..]
> > @@ -913,19 +927,21 @@ static void cfq_group_served(struct cfq_data *cfqd, struct cfq_group *cfqg,
> > struct cfq_queue *cfqq)
> > {
> > struct cfq_rb_root *st = &cfqd->grp_service_tree;
> > - unsigned int used_sl, charge_sl;
> > + unsigned int used_sl, charge;
> > int nr_sync = cfqg->nr_cfqq - cfqg_busy_async_queues(cfqd, cfqg)
> > - cfqg->service_tree_idle.count;
> >
> > BUG_ON(nr_sync < 0);
> > - used_sl = charge_sl = cfq_cfqq_slice_usage(cfqq);
> > + used_sl = charge = cfq_cfqq_slice_usage(cfqq);
> >
> > - if (!cfq_cfqq_sync(cfqq) && !nr_sync)
> > - charge_sl = cfqq->allocated_slice;
> > + if (iops_mode(cfqd))
> > + charge = cfqq->slice_dispatch;
>
> Hi Vivek,
>
> At this point, requests may still be sitting in the dispatch list. Shall we add a new
> variable in cfqq to keep track of the number of requests that actually go into the
> driver, and charge that number instead?
>

Hi Gui,

How does that help? Even if a request is still in the dispatch list, it will be
dispatched sooner or later. As long as we can make sure that the requests in the
dispatch list are in proportion to group weights, things should be just fine.

Thanks
Vivek