From: Jens Axboe
On 07/13/2010 05:14 PM, Jeff Moyer wrote:
> Hi,
>
> In running a test case that tries to trip up the kernel's AIO
> implementation, we ran into a situation where no other I/O to the device
> under test would be completed. The test program spawned (in this case)
> 100 threads, each of which performed the following in a loop:
>
> open file O_DIRECT
> queue 1MB of read I/O from file using 16 iocbs
> close file
> repeat
>
> The program does NOT wait for the I/O to complete. The file length is
> only 4MB, meaning that you have 25 threads performing I/O on each of the
> four 1MB regions.
>
> Both deadline and cfq check for aliased requests in the sorted list of
> I/Os, and when an alias is found, the request in the rb tree is moved to
> the dispatch list. So, what happens is that, with this workload, only
> requests from this program are moved to the dispatch list, starving out
> all other I/O.
>
> I've attached a patch that fixes this using one approach. The patch
> just implements a counter for aliased requests served in sequence, and
> dispatches from the fifo once the number of aliased requests serviced
> exceeds a certain value. What that value should be set to is difficult
> to determine. This should be a rare case, and you don't really want to
> go outside of the normal scheduling mechanism for I/Os. So, there is a
> balance to be struck between dispatching requests in the aliased request
> code path and allowing this starvation to go on, hoping it is a
> temporary situation (but ensuring progress).
>
> Another approach I've implemented gets rid of the counter, and simply
> checks the fifo when dispatching an aliased request.
>
> Jens, let me know which approach is more palatable and I'll spin up the
> appropriate patch for CFQ.

How about just skipping the counter, and always checking the fifo if
we have served one alias? I would prefer that, no need to make it
more complex than we have to.
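
To illustrate what I mean, here is a rough userspace toy model, not the
deadline or cfq code. Every name in it (toy_sched, toy_pick,
toy_dispatches_until_fifo) is made up for the example, and the two
counters only stand in for the real alias handling and fifo list. The
point is just that once one alias has been served, the fifo gets looked
at before the next alias.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not kernel identifiers. */
enum toy_source { TOY_NONE, TOY_ALIAS, TOY_FIFO };

struct toy_sched {
    int alias_pending;   /* aliased requests waiting to be dispatched */
    int fifo_pending;    /* requests from everyone else, waiting in the fifo */
    bool served_alias;   /* true if the previous dispatch was an alias */
};

/*
 * Pick the source of the next dispatched request.
 * After one alias has been served, the fifo is checked first, so a
 * steady stream of aliases cannot starve the fifo.
 */
static enum toy_source toy_pick(struct toy_sched *s)
{
    assert(s != NULL);

    if (s->served_alias && s->fifo_pending > 0) {
        s->fifo_pending--;
        s->served_alias = false;
        return TOY_FIFO;
    }
    if (s->alias_pending > 0) {
        s->alias_pending--;
        s->served_alias = true;
        return TOY_ALIAS;
    }
    if (s->fifo_pending > 0) {
        s->fifo_pending--;
        s->served_alias = false;
        return TOY_FIFO;
    }
    return TOY_NONE;
}

/*
 * Count dispatches up to and including the first fifo request.
 * Returns -1 if the queues drain without serving a fifo request.
 */
static int toy_dispatches_until_fifo(int alias_pending, int fifo_pending)
{
    struct toy_sched s = { alias_pending, fifo_pending, false };
    int count = 0;

    for (;;) {
        enum toy_source src = toy_pick(&s);

        if (src == TOY_NONE)
            return -1;
        count++;
        if (src == TOY_FIFO)
            return count;
    }
}
```

In this model, with 1000 aliases queued and a single fifo request
waiting, the fifo request goes out as the second dispatch instead of
waiting behind the entire alias run.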

It's one of those things that you think of when adding the alias
support, but don't envision ever causing any issues in the real
world. But let's fix it, it definitely can happen as you demonstrated.

--
Jens Axboe
