From: Christof Schmitt on
When running tests with an ext2 filesystem on a device using DIF/DIX
integrity data, I sometimes see invalid guard tags on write requests.
To track down the problem, I patched the function sd_prep_fn in sd.c
to verify the IP checksums in the guard tags against the actual data.
Sometimes there is a mismatch, and the write request then fails when
the HBA checks the guard tag.

Since the guard tags are created in Linux, it seems that the data
attached to the write request changes between the generation in
bio_integrity_generate and the call to sd_prep_fn.
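
For readers following along, the check described above can be sketched in
user space. This is only an illustration, not the actual sd.c patch: the
names ip_checksum, guard_tag_matches and write_survives_modification are
hypothetical, the checksum is the plain RFC 1071 Internet checksum over one
512-byte sector, and the byte order of the tag is an assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SECTOR_SIZE 512

/*
 * RFC 1071 Internet checksum over len bytes, treating the data as
 * big-endian 16-bit words. The 32-bit accumulator is ample for
 * sector-sized buffers; much larger inputs would need periodic folding.
 */
static uint16_t ip_checksum(const uint8_t *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;  /* pad odd tail with zero */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);   /* fold carries back in */
    return (uint16_t)~sum;
}

/* Returns 1 if the guard tag still matches the data, 0 otherwise. */
static int guard_tag_matches(const uint8_t *data, size_t len, uint16_t guard)
{
    return ip_checksum(data, len) == guard;
}

/*
 * Hypothetical model of the suspected sequence: the tag is generated
 * (stand-in for bio_integrity_generate), one byte of the buffer is then
 * overwritten while the write is "in flight", and the tag is re-checked
 * (stand-in for the check patched into sd_prep_fn).
 */
static int write_survives_modification(uint8_t *sector, size_t len,
                                       size_t offset, uint8_t new_byte)
{
    uint16_t guard;

    assert(offset < len);
    guard = ip_checksum(sector, len);
    sector[offset] = new_byte;
    return guard_tag_matches(sector, len, guard);
}
```

Calling write_survives_modification on a zeroed sector with a non-zero
new_byte returns 0, which corresponds to the mismatch seen in the test:
the tag was valid when it was generated, but no longer describes the data
that reaches the driver.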

Using ext3 or ext4 instead of ext2 does not show the problem.

There is a bugzilla open at Red Hat with the same symptom, but it has
no data or activity:
https://bugzilla.redhat.com/show_bug.cgi?id=574266

What would be the best way to track down this problem?

--
Christof Schmitt
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
From: Christof Schmitt on
On Mon, May 31, 2010 at 01:28:17PM +0200, Christof Schmitt wrote:
> When running tests with an ext2 filesystem on a device using DIF/DIX
> integrity data, i sometimes see invalid guard tags on write requests.
> To track down the problem, i patched the function sd_prep_fn in sd.c
> to verify the IP checksums in the guard tags against the actual data.
> Sometimes there is a mismatch and the write request fails when the HBA
> checks the guard tag.
>
> Since the guard tags are created in Linux, it seems that the data
> attached to the write request changes between the generation in
> bio_integrity_generate and the call to sd_prep_fn.
>
> Using ext3 or ext4 instead of ext2 does not show the problem.
>
> There is a bugzilla open at Redhat with the same symptom, but there is
> no data or activity:
> https://bugzilla.redhat.com/show_bug.cgi?id=574266
>
> What would be the best way to track down this problem?

One more thing: the test is running with a 2.6.34 kernel; the problem
in the bugzilla is reported for 2.6.33.

Christof Schmitt
From: Martin K. Petersen on
>>>>> "Christof" == Christof Schmitt <christof.schmitt(a)de.ibm.com> writes:

Christof> Since the guard tags are created in Linux, it seems that the
Christof> data attached to the write request changes between the
Christof> generation in bio_integrity_generate and the call to
Christof> sd_prep_fn.

Yep, known bug. Page writeback locking is messed up for buffer_head
users. The extNfs folks volunteered to look into this a while back but
I don't think they have found the time yet.


Christof> Using ext3 or ext4 instead of ext2 does not show the problem.

Last I looked there were still code paths in ext3 and ext4 that
permitted pages to be changed during flight. I guess you've just been
lucky.

--
Martin K. Petersen Oracle Linux Engineering
From: Christof Schmitt on
On Mon, May 31, 2010 at 10:20:44AM -0400, Martin K. Petersen wrote:
> >>>>> "Christof" == Christof Schmitt <christof.schmitt(a)de.ibm.com> writes:
>
> Christof> Since the guard tags are created in Linux, it seems that the
> Christof> data attached to the write request changes between the
> Christof> generation in bio_integrity_generate and the call to
> Christof> sd_prep_fn.
>
> Yep, known bug. Page writeback locking is messed up for buffer_head
> users. The extNfs folks volunteered to look into this a while back but
> I don't think they have found the time yet.

Thanks for the info. Does this mean that the bug appears with all
filesystems?

>
>
> Christof> Using ext3 or ext4 instead of ext2 does not show the problem.
>
> Last I looked there were still code paths in ext3 and ext4 that
> permitted pages to be changed during flight. I guess you've just been
> lucky.

ext3 looks good so far. I also see the problem with ext4, so I spoke
too soon on that one. I will start a longer test run with ext3 to see
if and when the problem appears with ext3 in my setup.

--
Christof Schmitt
From: Nick Piggin on
On Mon, May 31, 2010 at 10:20:44AM -0400, Martin K. Petersen wrote:
> >>>>> "Christof" == Christof Schmitt <christof.schmitt(a)de.ibm.com> writes:
>
> Christof> Since the guard tags are created in Linux, it seems that the
> Christof> data attached to the write request changes between the
> Christof> generation in bio_integrity_generate and the call to
> Christof> sd_prep_fn.
>
> Yep, known bug. Page writeback locking is messed up for buffer_head
> users. The extNfs folks volunteered to look into this a while back but
> I don't think they have found the time yet.

What do you mean by messed up? Allowing modifications to the page while
it is under writeback? This is deliberate of course and not limited to
buffer_head users either.
