Linux mdadm superblock question. [Kernel]


From: Mr. James W. Laferriere on 14 Feb 2010 22:50

Hello All ,

On Mon, 15 Feb 2010, Rudy Zijlstra wrote:
> H. Peter Anvin wrote:
>> In Fedora 12, for example, Dracut tries to make the distinction between
>> whole RAID device and a partition device, and utterly fails -- often
>> resulting in data loss.
>>
> i do not use Fedora/redhat and do not intend to ever try them again... still,
> the point is valid
>> With a pointer to the beginning this would have been a trivial thing to
>> detect.
>>
>> IMO it would make sense to support autoassemble for 1.0 superblocks, and
>> making them the default. The purpose would be to get everyone off 0.9.
>> However, *any* default is better than 1.1.
>> -hpa
> As long as autodetect is supported in the kernel, i am willing to upgrade to
> 1.0 superblocks. BUT i need the autodetect in the kernel, as i refuse to use
> initrd for production servers.
> Cheers,
> Rudy
I also have to agree with Rudy in this matter .

Tia , JimL
--
+------------------------------------------------------------------+
| James W. Laferriere | System Techniques | Give me VMS |
| Network&System Engineer | 3237 Holden Road | Give me Linux |
| babydr(a)baby-dragons.com | Fairbanks, AK. 99709 | only on AXP |
+------------------------------------------------------------------+
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo(a)vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/

From: martin f krafft on 15 Feb 2010 04:20

also sprach Gabor Gombas <gombasg(a)digikabel.hu> [2010.02.15.1028 +1300]:
> There is no autodetection with 1.1. Once you have mdadm.conf you have
> pretty hard rules about what to look for and how to assemble it - ie.
> there is not much left to "auto" detect. Real autodetection would mean
> there is _no_ such information available, and you figure out everything
> by just looking at the devices you find.

Which, coincidentally, is where we're heading with incremental
assembly. Check the Debian experimental package if you want to try.

--
martin | http://madduck.net/ | http://two.sentenc.es/

"we should have a volleyballocracy.
we elect a six-pack of presidents.
each one serves until they screw up,
at which point they rotate."
-- dennis miller

spamtraps: madduck.bogus(a)madduck.net

From: Neil Brown on 15 Feb 2010 19:30

On Sat, 13 Feb 2010 11:58:03 -0800
"H. Peter Anvin" <hpa(a)zytor.com> wrote:

> On 02/11/2010 05:52 PM, Michael Evans wrote:
> > On Thu, Feb 11, 2010 at 3:00 PM, Justin Piszcz <jpiszcz(a)lucidpixels.com> wrote:
> >> Hi,
> >>
> >> I may be converting a host to ext4 and was curious, is 0.90 still the only
> >> superblock version for mdadm/raid-1 that you can boot from without having to
> >> create an initrd/etc?
> >>
> >> Are there any benefits to using a superblock > 0.90 for a raid-1 boot volume
> >> < 2TB?
> >>
> >> Justin.
> >> --
> >> To unsubscribe from this list: send the line "unsubscribe linux-raid" in
> >> the body of a message to majordomo(a)vger.kernel.org
> >> More majordomo info at http://vger.kernel.org/majordomo-info.html
> >>
> >
> > You need the superblock at the end of the partition: If you read the
> > manual that is clearly either version 0.90 OR 1.0 (NOT 1.1 and also
> > NOT 1.2; those use the same superblock layout but different
> > locations).
>
> 0.9 has the *serious* problem that it is hard to distinguish a whole-volume
>
> However, apparently mdadm recently switched to a 1.1 default. I
> strongly urge Neil to change that to either 1.0 or 1.2, as I have
> started to get complaints from users that they have made RAID volumes
> with newer mdadm which apparently default to 1.1, and then want to boot
> from them (without playing MBR games like Grub does.) I have to tell
> them that they have to regenerate their disks -- the superblock occupies
> the boot sector and there is nothing I can do about it. It's the same
> pathology XFS has.

When mdadm defaults to 1.0 for a RAID1 it prints a warning to the effect that
the array might not be suitable to store '/boot', and requests confirmation.

So I assume that the people who are having this problem either do not read,
or are using some partitioning tool that runs mdadm under the hood using
"--run" to avoid the need for confirmation. It would be nice to confirm if
that was the case, and find out what tool is being used.

If an array is not being used for /boot (or /) then I still think that 1.1 is
the better choice as it removes the possibility for confusion over partition
tables.

I guess I could try defaulting to 1.2 in a partition, and 1.1 on a
whole-device. That might be a suitable compromise.
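
The placement of each metadata version discussed in this thread can be sketched as a small Python function. This is a minimal sketch, assuming the offsets are as commonly documented for md (0.90 in the last 64 KiB-aligned 64 KiB block, 1.0 at least 8 KiB from the end on a 4 KiB boundary, 1.1 at the very start, 1.2 at 4 KiB from the start). The function name and the sample device size are illustrative and are not part of mdadm.

```python
SECTOR = 512  # md counts device sizes in 512-byte sectors


def superblock_offset(version: str, dev_bytes: int) -> int:
    """Return the assumed byte offset of the md superblock on a component device."""
    sectors = dev_bytes // SECTOR
    if version == "0.90":
        reserved = 128  # 64 KiB expressed in sectors
        # Round the size down to a 64 KiB boundary, then step back 64 KiB.
        return ((sectors & ~(reserved - 1)) - reserved) * SECTOR
    if version == "1.0":
        # At least 8 KiB (16 sectors) from the end, on a 4 KiB (8 sector) boundary.
        return ((sectors - 16) & ~7) * SECTOR
    if version == "1.1":
        return 0  # occupies the first sector, i.e. where a boot sector would live
    if version == "1.2":
        return 4096  # leaves the first 4 KiB of the device untouched
    raise ValueError(f"unknown metadata version: {version}")


if __name__ == "__main__":
    size = 1024 ** 3  # hypothetical 1 GiB component device
    for v in ("0.90", "1.0", "1.1", "1.2"):
        print(v, superblock_offset(v, size))
```

Only 1.1 places metadata in sector 0, which is why it conflicts with a boot sector, while 0.90 and 1.0 leave the start of the device looking like a plain filesystem or partition.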

How do people cope with XFS??

NeilBrown

From: Neil Brown on 15 Feb 2010 20:00

On Thu, 11 Feb 2010 18:00:23 -0500 (EST)
Justin Piszcz <jpiszcz(a)lucidpixels.com> wrote:

> Hi,
>
> I may be converting a host to ext4 and was curious, is 0.90 still the only
> superblock version for mdadm/raid-1 that you can boot from without having
> to create an initrd/etc?
>
> Are there any benefits to using a superblock > 0.90 for a raid-1 boot
> volume < 2TB?

The only noticeable differences that I can think of are:
1/ If you reboot during recovery of a spare, then 0.90 will restart the
recovery at the start, while 1.x will restart from where it was up to.
2/ The /sys/class/block/mdXX/md/dev-YYY/errors counter is reset on each
re-assembly with 0.90, but is preserved across stop/start with 1.x
3/ If your partition starts on a multiple of 64K from the start of the
device and is the last partition and contains 0.90 metadata, then
mdadm can get confused by it.
4/ If you move the devices to a host with a different arch and different
byte-ordering, then extra effort will be needed to see the array for
0.90, but not for 1.x

I suspect none of these is a big issue.
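
Point 3 can be illustrated with a little arithmetic. The sketch below assumes the usual 0.90 placement rule (the superblock sits in the last 64 KiB-aligned 64 KiB block of the device) and shows that a whole disk and a last partition starting on a 64 KiB multiple both point at the same absolute location, so the same superblock appears to belong to either. The helper names and the disk size are hypothetical.

```python
ALIGN = 64 * 1024  # 0.90 superblocks are placed on 64 KiB boundaries


def sb090_offset(size_bytes: int) -> int:
    """Offset of a 0.90 superblock relative to the start of a device of this size."""
    return (size_bytes & ~(ALIGN - 1)) - ALIGN


def absolute_sb090(part_start: int, part_size: int) -> int:
    """Absolute on-disk position of the 0.90 superblock of a partition."""
    return part_start + sb090_offset(part_size)


if __name__ == "__main__":
    disk_size = 1000 * 1024 * 1024 + 100 * 512  # hypothetical disk

    # Last partition starting at 1 MiB (a multiple of 64 KiB), running to the end.
    aligned_start = 16 * ALIGN
    print(sb090_offset(disk_size) ==
          absolute_sb090(aligned_start, disk_size - aligned_start))

    # Classic sector-63 start (not a multiple of 64 KiB): the locations differ.
    odd_start = 63 * 512
    print(sb090_offset(disk_size) ==
          absolute_sb090(odd_start, disk_size - odd_start))
```

With the aligned start the two positions coincide, so a tool scanning both the disk and the partition finds the same metadata twice and cannot tell from location alone which one is the array member.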

It is likely that future extensions will only be supported on 1.x metadata.
For example I hope to add support for storing a bad-block list, so that a
read error during recovery will only be fatal for that block, not the whole
recovery process. This is unlikely ever to be supported on 0.90. However
it may not be possible to hot-enable it on 1.x either, depending on how much
space has been reserved for extra metadata, so there is no guarantee that
using 1.x now makes you future-proof.

And yes, 0.90 is still the only superblock version that supports in-kernel
autodetect, and I have no intention of adding in-kernel autodetect for any
other version.

NeilBrown

From: H. Peter Anvin on 15 Feb 2010 20:30

On 02/15/2010 04:27 PM, Neil Brown wrote:
>
> When mdadm defaults to 1.0 for a RAID1 it prints a warning to the effect that
> the array might not be suitable to store '/boot', and requests confirmation.
>
> So I assume that the people who are having this problem either do not read,
> or are using some partitioning tool that runs mdadm under the hood using
> "--run" to avoid the need for confirmation. It would be nice to confirm if
> that was the case, and find out what tool is being used.

My guess is that they are using the latter. However, some of it is
probably also a matter of not planning ahead, or not understanding the
error message. I'll forward one email privately (don't want to forward
a private email to a list.)

> If an array is not being used for /boot (or /) then I still think that 1.1 is
> the better choice as it removes the possibility for confusion over partition
> tables.
>
> I guess I could try defaulting to 1.2 in a partition, and 1.1 on a
> whole-device. That might be a suitable compromise.

In some ways, 1.1 is even more toxic on a whole-device, since that means
that it is physically impossible to boot off of it -- the hardware will
only ever read the first sector (MBR).
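
To make the conflict concrete, here is a minimal sketch that inspects the first 512 bytes of a device image and reports whether they look like an md 1.x superblock or a conventional boot sector. It assumes the md magic number 0xa92b4efc stored little-endian at the start of a 1.x superblock, and the standard 0x55 0xAA signature in the last two bytes of a boot sector. The function name is illustrative only.

```python
import struct

MD_SB_MAGIC = 0xA92B4EFC       # assumed md superblock magic, little-endian in 1.x
BOOT_SIGNATURE = b"\x55\xaa"   # last two bytes of a bootable first sector


def first_sector_kind(sector: bytes) -> str:
    """Classify the first 512 bytes of a device."""
    if len(sector) < 512:
        raise ValueError("need at least 512 bytes")
    (magic,) = struct.unpack_from("<I", sector, 0)
    if magic == MD_SB_MAGIC:
        # A 1.1 superblock starts here, so there is no room for boot code.
        return "md 1.x superblock"
    if sector[510:512] == BOOT_SIGNATURE:
        return "boot sector"
    return "unknown"


if __name__ == "__main__":
    md_sector = struct.pack("<I", MD_SB_MAGIC) + bytes(508)
    mbr_sector = bytes(510) + BOOT_SIGNATURE
    print(first_sector_kind(md_sector))
    print(first_sector_kind(mbr_sector))
```

The two checks look at different ends of the same sector, but the sector can only serve one purpose: with 1.1 metadata on a whole device it carries the superblock, so nothing the firmware could execute remains there.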

> How do people cope with XFS??

There are three options:

a) either don't boot from it (separate /boot);
b) use a bootloader which installs in the MBR and
hopefully-unpartitioned disk areas (e.g. Grub);
c) use a nonstandard custom MBR.

Neither (b) nor (c), of course, allows for chainloading from another OS
install, and thus both are bad for interoperability.

-hpa

--
H. Peter Anvin, Intel Open Source Technology Center
I work for Intel. I don't speak on their behalf.

