From: Ing. Daniel Rozsnyó on
Hello,
I am having trouble with nested RAID - when one array is added to
the other, "bio too big device md0" messages start appearing:

bio too big device md0 (144 > 8)
bio too big device md0 (248 > 8)
bio too big device md0 (32 > 8)

Internet searches turned up no solution or report of an error like mine,
only a note that data corruption can occur when this happens.

Description:

My setup is the following - one 2TB drive and four 500GB drives. The goal
is to mirror the 2TB drive against a linear array of the other four
drives.
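
Roughly, this is how the arrays were created (from memory; the exact
metadata and bitmap options may have differed):

# mdadm --create /dev/md1 --level=linear --raid-devices=4 \
        /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# mdadm --create /dev/md0 --level=1 --raid-devices=2 --bitmap=internal \
        /dev/sda2 missing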

So, the state before the error appears is this:

# cat /proc/mdstat
Personalities : [linear] [raid0] [raid1] [raid10] [raid6] [raid5] [raid4]
md1 : active linear sdb1[0] sde1[3] sdd1[2] sdc1[1]
1953535988 blocks super 1.1 0k rounding

md0 : active raid1 sda2[0]
1953447680 blocks [2/1] [U_]
bitmap: 233/233 pages [932KB], 4096KB chunk

unused devices: <none>

With these block request sizes:

# cat /sys/block/md{0,1}/queue/max_{,hw_}sectors_kb
127
127
127
127

Now I add the four-drive array to the mirror, and the system starts
showing the bio error on any significant disk activity (probably only
on writes). The reboot/shutdown process is full of these errors.

The step that messes up the system (ignore the "re-added"; the same thing
happened the very first time I constructed the four-drive array an hour ago):

# mdadm /dev/md0 --add /dev/md1
mdadm: re-added /dev/md1

# cat /sys/block/md{0,1}/queue/max_{,hw_}sectors_kb
4
4
127
127
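
The limits of the underlying disks can be checked the same way (their
values are not affected by the --add; output not captured here):

# cat /sys/block/sd{a,b,c,d,e}/queue/max_{,hw_}sectors_kb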

dmesg just shows this:

md: bind<md1>
RAID1 conf printout:
--- wd:1 rd:2
disk 0, wo:0, o:1, dev:sda2
disk 1, wo:1, o:1, dev:md1
md: recovery of RAID array md0
md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
md: using maximum available idle IO bandwidth (but not more than 200000
KB/sec) for recovery.
md: using 128k window, over a total of 1953447680 blocks.


And as soon as a write occurs on the array:

bio too big device md0 (40 > 8)

Removing md1 from md0 does not help the situation; I need to reboot
the machine.

The md0 array holds LVM, with root, swap, portage, distfiles and home
logical volumes inside it.

My system is:

# uname -a
Linux desktop 2.6.32-gentoo-r1 #2 SMP PREEMPT Sun Jan 24 12:06:13 CET
2010 i686 Intel(R) Xeon(R) CPU X3220 @ 2.40GHz GenuineIntel GNU/Linux


Thanks for any help,

Daniel

From: Marti Raudsepp on
2010/1/24 "Ing. Daniel Rozsny�" <daniel(a)rozsnyo.com>:
> Hello,
> I am having troubles with nested RAID - when one array is added to the
> other, the "bio too big device md0" messages are appearing:
>
> bio too big device md0 (144 > 8)
> bio too big device md0 (248 > 8)
> bio too big device md0 (32 > 8)

I *think* this is the same bug that I hit years ago when mixing
different disks and 'pvmove'.

It's a design flaw in the DM/MD frameworks; see comment #3 from Milan Broz:
http://bugzilla.kernel.org/show_bug.cgi?id=9401#c3

Regards,
Marti
From: Milan Broz on
On 01/25/2010 04:25 PM, Marti Raudsepp wrote:
> 2010/1/24 "Ing. Daniel Rozsny�" <daniel(a)rozsnyo.com>:
>> Hello,
>> I am having troubles with nested RAID - when one array is added to the
>> other, the "bio too big device md0" messages are appearing:
>>
>> bio too big device md0 (144 > 8)
>> bio too big device md0 (248 > 8)
>> bio too big device md0 (32 > 8)
>
> I *think* this is the same bug that I hit years ago when mixing
> different disks and 'pvmove'
>
> It's a design flaw in the DM/MD frameworks; see comment #3 from Milan Broz:
> http://bugzilla.kernel.org/show_bug.cgi?id=9401#c3

Hm. I don't think it is the same problem; you are only adding a device to an md array...
(adding cc: Neil, this seems to me like MD bug).

(original report for reference is here http://lkml.org/lkml/2010/1/24/60 )

Milan
--
mbroz(a)redhat.com
From: Neil Brown on
On Mon, 25 Jan 2010 19:27:53 +0100
Milan Broz <mbroz(a)redhat.com> wrote:

> On 01/25/2010 04:25 PM, Marti Raudsepp wrote:
> > 2010/1/24 "Ing. Daniel Rozsnyó" <daniel(a)rozsnyo.com>:
> >> Hello,
> >> I am having troubles with nested RAID - when one array is added to the
> >> other, the "bio too big device md0" messages are appearing:
> >>
> >> bio too big device md0 (144 > 8)
> >> bio too big device md0 (248 > 8)
> >> bio too big device md0 (32 > 8)
> >
> > I *think* this is the same bug that I hit years ago when mixing
> > different disks and 'pvmove'
> >
> > It's a design flaw in the DM/MD frameworks; see comment #3 from Milan Broz:
> > http://bugzilla.kernel.org/show_bug.cgi?id=9401#c3
>
> Hm. I don't think it is the same problem, you are only adding device to md array...
> (adding cc: Neil, this seems to me like MD bug).
>
> (original report for reference is here http://lkml.org/lkml/2010/1/24/60 )

No, I think it is the same problem.

When you have a stack of devices, the top level client needs to know the
maximum restrictions imposed by lower level devices to ensure it doesn't
violate them.
However there is no mechanism for a device to report that its restrictions
have changed.
So when md0 gains a linear leg and so needs to reduce the max size for
requests, there is no way to tell DM, so DM doesn't know. And as the
filesystem only asks DM for restrictions, it never finds out about the
new restrictions.
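
The disconnect is visible from userspace: the DM devices stacked on md0
keep advertising the limits they computed when their tables were loaded,
even after md0's own limit has dropped. Something like this would show it
(device names and numbers will vary):

# cat /sys/block/md0/queue/max_sectors_kb
# dmsetup ls
# cat /sys/block/dm-0/queue/max_sectors_kb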

This should be fixed by having the filesystem not care about restrictions,
and the lower levels just split requests as needed, but that just hasn't
happened....

If you completely assemble md0 before activating the LVM stuff on top of it,
this should work.
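
In other words, something along these lines at boot time (a sketch only;
device names and the LVM activation step depend on your setup):

# mdadm --assemble /dev/md1 /dev/sdb1 /dev/sdc1 /dev/sdd1 /dev/sde1
# mdadm --assemble /dev/md0 /dev/sda2 /dev/md1
# vgchange -ay

That way LVM only computes its limits once md0 already has both legs.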

NeilBrown
From: Ing. Daniel Rozsnyó on
Neil Brown wrote:
> On Mon, 25 Jan 2010 19:27:53 +0100
> Milan Broz <mbroz(a)redhat.com> wrote:
>
>> On 01/25/2010 04:25 PM, Marti Raudsepp wrote:
>>> 2010/1/24 "Ing. Daniel Rozsnyó" <daniel(a)rozsnyo.com>:
>>>> Hello,
>>>> I am having troubles with nested RAID - when one array is added to the
>>>> other, the "bio too big device md0" messages are appearing:
>>>>
>>>> bio too big device md0 (144 > 8)
>>>> bio too big device md0 (248 > 8)
>>>> bio too big device md0 (32 > 8)
>>> I *think* this is the same bug that I hit years ago when mixing
>>> different disks and 'pvmove'
>>>
>>> It's a design flaw in the DM/MD frameworks; see comment #3 from Milan Broz:
>>> http://bugzilla.kernel.org/show_bug.cgi?id=9401#c3
>> Hm. I don't think it is the same problem, you are only adding device to md array...
>> (adding cc: Neil, this seems to me like MD bug).
>>
>> (original report for reference is here http://lkml.org/lkml/2010/1/24/60 )
>
> No, I think it is the same problem.
>
> When you have a stack of devices, the top level client needs to know the
> maximum restrictions imposed by lower level devices to ensure it doesn't
> violate them.
> However there is no mechanism for a device to report that its restrictions
> have changed.
> So when md0 gains a linear leg and so needs to reduce the max size for
> requests, there is no way to tell DM, so DM doesn't know. And as the
> filesystem only asks DM for restrictions, it never finds out about the
> new restrictions.

Neil, why does it even reduce its block size? I've tried both
"linear" and "raid0" (as they are the only ways to get 2T from 4x500G)
and both behave the same (sda has 512, md0 has 127, linear has 127 and
raid0 has 512 kB as the max request size).

I do not see the mechanism by which 512:127 or 512:512 leads to a 4 kB limit.
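
If I read the error message right, its numbers are in 512-byte sectors,
so the limit of 8 is exactly one 4 kB page, i.e. the same value that
max_sectors_kb now reports:

# echo $((8 * 512))
4096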

Is it because of:
- the ongoing rebuild of the array?
- a max block size that is not an exact multiple?
- a total device size that is not an exact multiple?
- the nesting?
- some other fallback to 1 page?

I ask because I cannot believe that a pre-assembled nested stack would
result in a 4 kB max limit. But I haven't tried it yet (e.g. from a live CD).

A block device should not do this kind of "magic" unless the higher
layers support it. Which of them has proper support, then?
- standard partition table?
- LVM?
- filesystem drivers?

> This should be fixed by having the filesystem not care about restrictions,
> and the lower levels just split requests as needed, but that just hasn't
> happened....
>
> If you completely assemble md0 before activating the LVM stuff on top of it,
> this should work.
>
> NeilBrown
