From: Jens Axboe on
On 2010-06-28 14:44, Christoph Hellwig wrote:
> On Mon, Jun 28, 2010 at 02:41:30PM +0200, Jens Axboe wrote:
>> The horrible part is working around that issue by fiddling with the
>> assignment of the internal vec. THAT looks like a horrible solution
>> to that problem.
>>
>> How about just adding a check to bio_has_data() for non-zero
>> bio->bi_vcnt?
>
> The question is what a discard request from the block layer should look
> like. With Mike's patch we have the same situation as for a barrier
> request: absolutely no data transferred and no indicator of it. IMHO
> that's much better than any partially constructed request. And yes,
> that means enabling the payload later in the driver.

With a barrier, it's more clear I think - if it carries data, then
you account that. If it's an empty barrier, then there's nothing to
account. There will be an impact on the io stream, but that is
indicated in blktrace for instance.

> The other option would be to not reuse the request at all and just
> allocate a new request and use that from sd_prep_fn. That's what
> I tried to implement first, but I couldn't get it to work. Given
> all the issues we have with the current approach I'm almost tempted
> to try that again.

That sounds way cleaner...

--
Jens Axboe

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Mike Snitzer on
On Mon, Jun 28 2010 at 8:41am -0400,
Jens Axboe <axboe(a)kernel.dk> wrote:

> On 2010-06-28 14:37, Mike Snitzer wrote:
> > On Mon, Jun 28 2010 at 8:34am -0400,
> > Jens Axboe <axboe(a)kernel.dk> wrote:
> >
> >> On 2010-06-26 21:56, Mike Snitzer wrote:
> >>> Don't alloc discard bio with a biovec in blkdev_issue_discard. Doing so
> >>> means bio_has_data() will not be true until the SCSI layer adds the
> >>> payload to the discard request via blk_add_request_payload.
> >>
> >> Sorry, this looks horrible.
> >
> > Your judgment isn't giving me much to work with... not sure where I go
> > with "horrible".
>
> The horrible part is working around that issue by fiddling with the
> assignment of the internal vec. THAT looks like a horrible solution
> to that problem.
>
> How about just adding a check to bio_has_data() for non-zero
> bio->bi_vcnt?

Sure, that works.
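
For illustration, the check Jens suggests could look roughly like the
user-space sketch below. It is not the kernel code: struct bio_mock and
struct bio_vec_mock are hypothetical stand-ins that model only bi_io_vec
and bi_vcnt, and the "old" variant assumes the existing check tests
nothing more than a non-NULL bi_io_vec.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the two
 * fields discussed in the thread are modelled. */
struct bio_vec_mock {
	void *page;
	unsigned int len;
};

struct bio_mock {
	struct bio_vec_mock *bi_io_vec; /* vector array, may be preallocated */
	unsigned short bi_vcnt;         /* number of vectors actually filled in */
};

/* Assumed shape of the existing check: any biovec pointer means "has data". */
bool bio_has_data_old(const struct bio_mock *bio)
{
	return bio && bio->bi_io_vec != NULL;
}

/* Sketch of the suggestion: additionally require a non-zero bi_vcnt. */
bool bio_has_data_new(const struct bio_mock *bio)
{
	return bio && bio->bi_io_vec != NULL && bio->bi_vcnt != 0;
}
```

With a discard bio that has a biovec allocated but no payload attached
yet (bi_vcnt == 0), the old variant reports data while the new one does
not; once a payload is added and bi_vcnt becomes 1, both agree.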
From: FUJITA Tomonori on
On Mon, 28 Jun 2010 08:29:55 -0400
Mike Snitzer <snitzer(a)redhat.com> wrote:

> On Mon, Jun 28 2010 at 6:33am -0400,
> FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
>
> > On Sat, 26 Jun 2010 15:56:51 -0400
> > Mike Snitzer <snitzer(a)redhat.com> wrote:
> >
> > > Don't alloc discard bio with a biovec in blkdev_issue_discard. Doing so
> > > means bio_has_data() will not be true until the SCSI layer adds the
> > > payload to the discard request via blk_add_request_payload.
> > >
> > > bio_{enable,disable}_inline_vecs are not expected to be widely used so
> > > they were exported using EXPORT_SYMBOL_GPL.
> > >
> > > This patch avoids the need for the following VM accounting fix for
> > > discards: http://lkml.org/lkml/2010/6/23/361
> >
> > Why do we need to avoid the above fix?
>
> We don't _need_ to. We avoid the need for it as a side-effect of the
> cleanup that my patch provides.
>
> > Surely, the above fix is hacky but much simpler than this patch.
>
> My patch wasn't meant as an alternative to Tao Ma's patch. Again, it
> just obviates the need for it.
>
> Your tolerance for "hacky" is difficult to understand. On the one hand
> (PATCH 1/2) you have no tolerance for "hacky" fixes for leaks (that
> introduce a short-term SCSI layering violation).

Sorry if I was not clear enough.

- SCSI layering violation is bad.

- A 'short term' solution always turns out to be a long-term solution. We
should have a clean solution from the start.

- Complicating the SCSI I/O completion is bad (already complicated
enough).

....

And the 'leaks' bug is still in -next. No need to fix it in a hacky
way. We can just drop it from -next.


> But in this case
> you're perfectly fine with BIO_RW_DISCARD special casing?

BIO_RW_DISCARD special casing is already everywhere in the block layer. I
would prefer to have less of it. However, as long as it stays in the block
layer, I can live with it. After all, that's a block layer thing.

At least it looks much better than this patch. This patch is really hacky
(as Jens said).
From: Mike Snitzer on
On Mon, Jun 28 2010 at 11:15am -0400,
FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:

> On Mon, 28 Jun 2010 08:29:55 -0400
> Mike Snitzer <snitzer(a)redhat.com> wrote:
>
> > On Mon, Jun 28 2010 at 6:33am -0400,
> > FUJITA Tomonori <fujita.tomonori(a)lab.ntt.co.jp> wrote:
> >
> > > On Sat, 26 Jun 2010 15:56:51 -0400
> > > Mike Snitzer <snitzer(a)redhat.com> wrote:
> > >
> > > > Don't alloc discard bio with a biovec in blkdev_issue_discard. Doing so
> > > > means bio_has_data() will not be true until the SCSI layer adds the
> > > > payload to the discard request via blk_add_request_payload.
> > > >
> > > > bio_{enable,disable}_inline_vecs are not expected to be widely used so
> > > > they were exported using EXPORT_SYMBOL_GPL.
> > > >
> > > > This patch avoids the need for the following VM accounting fix for
> > > > discards: http://lkml.org/lkml/2010/6/23/361
> > >
> > > Why do we need to avoid the above fix?
> >
> > We don't _need_ to. We avoid the need for it as a side-effect of the
> > cleanup that my patch provides.
> >
> > > Surely, the above fix is hacky but much simpler than this patch.
> >
> > My patch wasn't meant as an alternative to Tao Ma's patch. Again, it
> > just obviates the need for it.
> >
> > Your tolerance for "hacky" is difficult to understand. On the one hand
> > (PATCH 1/2) you have no tolerance for "hacky" fixes for leaks (that
> > introduce a short-term SCSI layering violation).
>
> Sorry if I was not clear enough.
>
> - SCSI layering violation is bad.
>
> - A 'short term' solution always turns out to be a long-term solution. We
> should have a clean solution from the start.
>
> - Complicating the SCSI I/O completion is bad (already complicated
> enough).
>
> ...
>
> And the 'leaks' bug is still in -next. No need to fix it in a hacky
> way. We can just drop it from -next.
>
>
> > But in this case
> > you're perfectly fine with BIO_RW_DISCARD special casing?
>
> BIO_RW_DISCARD special casing is already everywhere in the block layer. I
> would prefer to have less of it. However, as long as it stays in the block
> layer, I can live with it. After all, that's a block layer thing.
>
> At least it looks much better than this patch. This patch is really hacky
> (as Jens said).

Christoph more clearly conveyed the intent of my patch. Its focus was
_not_ to eliminate the need for Tao Ma's vm accounting patch.

I was attempting to have the SCSI layer more comprehensively manage the
allocation and use of biovec associated with the discard payload (that
the SCSI layer was now also managing rather than relying on the block
layer). It is as simple as that.

Berating me with "really hacky" critiques doesn't change the fact that
both the block layer _and_ the SCSI layer need serious help on their
implementation of discard support. The entirety of Linux's current
discard support is "really hacky".

I think we can all agree on that; so if any good came of the discussion
over the past 24 hours it is: we now know work is needed to make Linux's
discard support more capable (a select few knew this, but many more are
aware of that fact now).

And the SCSI layer has a significant role in improving Linux's discard
capabilities. So relying on all discard changes to be in the block
layer isn't an option ;)

Regards,
Mike