From: Mike Tomlinson on

Does anyone know anything about very big disks?

I have a Linux system with a large SCSI array connected. This is
displayed by the array as two 2.4TB disks. Linux sees /dev/sda and
/dev/sdb.

These seem ok:

[root(a)example eng]# /sbin/fdisk -l /dev/sda

Disk /dev/sda: 2098.1 GB, 2098118787072 bytes
255 heads, 63 sectors/track, 255081 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes

Device Boot Start End Blocks Id System
/dev/sda1 1 255081 2048938101 8e Linux LVM

with identical results for sdb. This has worked well for some months.

I have added two more arrays and used LVM to join them logically into
one 5.6TB volume, which formatted fine and had been working well.

The new drives (sdc and sdd) I prepared using:

lvcreate -i2 -I4 -l1430410 -nLogVol02 VolGroup02 /dev/sdc1 /dev/sdd1
/dev/sdc2 /dev/sdd2
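
(The PVs and the volume group had been set up beforehand in the usual
way - from memory, roughly:

pvcreate /dev/sdc1 /dev/sdc2 /dev/sdd1 /dev/sdd2
vgcreate VolGroup02 /dev/sdc1 /dev/sdc2 /dev/sdd1 /dev/sdd2

so nothing exotic there.)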

This failed after some time. When a physical media scan is ordered by
the RAID controller, it reports no physical failure (e.g. no bad or
non-reassignable sectors). I checked the physical and logical volumes
using the pv* and lv* tools, which found no problems. The only thing
that alerted me to a possible problem was rsync, which reported
failures while writing a backup. Nothing appears in the syslog, except:

Jan 5 14:31:40 example kernel: Aborting journal on device dm-2.
Jan 5 14:31:40 example kernel: sd 1:0:3:0: SCSI error: return code =
0x000b0000
Jan 5 14:31:40 example kernel: end_request: I/O error, dev sdd, sector
4540873733
Jan 5 14:31:40 example kernel: sd 1:0:3:0: SCSI error: return code =
0x000b0000
Jan 5 14:31:41 example kernel: end_request: I/O error, dev sdd, sector
4540876117
Jan 5 14:31:41 example kernel: ext3_abort called.
Jan 5 14:31:41 example kernel: EXT3-fs error (device dm-2):
ext3_journal_start_sb: Detected aborted journal
Jan 5 14:31:41 example kernel: Remounting filesystem read-only

What am I doing wrong? Is the big LVM volume too big for rsync or ext2fs
tools to cope with, or what? I notice the failing sector numbers on sdd
are above 2^32, i.e. past the 2TiB mark, which makes me wonder if
something is hitting a size limit there. First time I've encountered
anything like this.

I'm on the point of giving up and using two separate 2TB partitions in
ext3.
What is the maximum size of an ext2 partition? Is that the problem?

Thanks

--
Mike Tomlinson
From: anahata on
On Tue, 02 Mar 2010 11:42:19 +0000, Mike Tomlinson wrote:

> What is the maximum size of an ext2 partition? Is that the problem?

Wikipedia says 2TiB if your block size is 1K.
http://en.wikipedia.org/wiki/Ext3

You can have bigger block sizes...
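
You can check what block size an existing filesystem has, or force a
bigger one when you create it, with something like this (substitute
your real device for /dev/yourdevice):

tune2fs -l /dev/yourdevice | grep 'Block size'
mke2fs -b 4096 -j /dev/yourdevice

The second command destroys the existing filesystem, obviously, so only
the first is safe on a live volume.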

--
Anahata
anahata(a)treewind.co.uk ==//== 01638 720444
http://www.treewind.co.uk
From: Mike Tomlinson on
In article <hmj1n7$bu6$1(a)energise.enta.net>, Gordon Henderson
<gordon+usenet(a)drogon.net> writes
>In article <+L3X1jAbmPjLFw47(a)none.invalid>,
>Mike Tomlinson <none(a)none.invalid> wrote:
>
>>What am I doing wrong? Is the big LVM volume too big for rsync or ext2fs
>>tools to cope with, or what? First time I've encountered anything like
>>this.
>
>Been some years since I've played with multi TB systems, but the last big
>one I built has 15 x 500GB drives in it which I used Linux s/w RAID-6
>to combine into a single 6.5TB drive (under ext3) - which then ran for
>a few years as a remote off-site backup device, using rsync to a bunch
>of servers.

Doing something very similar. I only noticed a message from rsync quite
by chance:

rsync: writefd_unbuffered failed to write 4 bytes [sender: Broken pipe
(32)]
rsync: write failed on "/mnt/backup": No space left on
device (28)

What is really nasty is that there is NOTHING in the syslog. And I just
happened to have rsync running in verbose mode and was watching the
screen, otherwise I wouldn't have known about this. It normally runs
the backup silently from a script.
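
Next time it trips I'll start with the obvious checks before anything
else, something like:

df -h /mnt/backup            # genuinely out of space?
df -i /mnt/backup            # or out of inodes?
mount | grep /mnt/backup     # or has it gone read-only?

but I'd still like to understand what is actually going on.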

Kinda lost confidence in LVM now. The most recent HOWTO I could find is
little more than a list of available commands, but I need more than
that.

>I don't recall having to do anything special to make it work (other
>than swear, curse and generally get annoyed with Dell) - I did compile
>a custom kernel with all the big-file stuff ticked, but I imagine you
>have that in your system.

Yes

> This was running Debian Etch IIRC.

CentOS 5.2

>With a 4KB block size the max size of an ext2/3 partition is 16TB.

Ta.

>Could you have formatted it with a 1K block size? That would
>limit it to 2TB .

My thought too. Trouble is, it's all sorts of nonsense with external
SCSI arrays and warnings of the OS not being able to access them if they
exceed 2TB in size, deciding whether to use a GUID partition table or
not, and using the pv* and lv* tools to create the logical drive. The
machine is live and so I'm not able to take it down to fiddle, and I
can't get settled to sit and spend a good session with it, and and and
<grrr>

>mkfs usually picks the "right thing" though - but if you did mkfs.ext2
>by hand, or some external program, maybe it got it wrong?

I used the following line:
mke2fs -v -j -m 0 /dev/md0

The -j is all I need to make it ext3, right? No further prep needed?

I didn't specify blocks in mke2fs, but did in the LVM creation:
lvcreate -i 2 -I 4 -l 75155 -n LogVol02 VolGroup02 /dev/sdc1 /dev/sdc2
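
I suppose I can at least check what the filesystem actually ended up
with, without taking anything offline - something like:

tune2fs -l /dev/VolGroup02/LogVol02 | egrep 'Block size|features'

(or whichever device is really mounted on /mnt/backup) should show the
block size and whether has_journal is set.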

Don't think I trust this any more. I'll take the safe option and delete
the RAID partitions, then reformat as standard ext3. I'll end up with
two partitions rather than the one I wanted, but I can work with that.

TYVM for your help (also to anahata)

--
Mike Tomlinson
From: Gordon Henderson on
In article <r0hiaWAd+QkLFwxK(a)none.invalid>,
Mike Tomlinson <none(a)none.invalid> wrote:
>In article <hmj1n7$bu6$1(a)energise.enta.net>, Gordon Henderson
><gordon+usenet(a)drogon.net> writes
>>In article <+L3X1jAbmPjLFw47(a)none.invalid>,
>>Mike Tomlinson <none(a)none.invalid> wrote:
>>
>>>What am I doing wrong? Is the big LVM volume too big for rsync or ext2fs
>>>tools to cope with, or what? First time I've encountered anything like
>>>this.
>>
>>Been some years since I've played with multi TB systems, but the last big
>>one I built has 15 x 500GB drives in it which I used Linux s/w RAID-6
>>to combine into a single 6.5TB drive (under ext3) - which then ran for
>>a few years as a remote off-site backup device, using rsync to a bunch
>>of servers.
>
>Doing something very similar. I only noticed a message from rsync quite
>by chance:
>
>rsync: writefd_unbuffered failed to write 4 bytes [sender: Broken pipe
>(32)]
>rsync: write failed on "/mnt/backup": No space left on
>device (28)
>
>What is really nasty is that there is NOTHING in the syslog. And I just
>happened to have rsync running in verbose mode and was watching the
>screen, otherwise I wouldn't have known about this. It normally runs
>the backup silently from a script.
>
>Kinda lost confidence in LVM now. The most recent HOWTO I could find is
>little more than a list of available commands, but I need more than
>that.

I tried LVM in the early days - got burnt, tried it briefly in the
LVM2 days, but didn't really see the point, so abandoned it and never
looked back.

However, I tend to build servers to a purpose and have rarely had the
need to re-size the disk array on a "live" box.

>>I don't recall having to do anything special to make it work (other
>>than swear, curse and generally get annoyed with Dell) - I did compile
>>a custom kernel with all the big-file stuff ticked, but I imagine you
>>have that in your system.
>
>Yes
>
>> This was running Debian Etch IIRC.
>
>CentOS 5.2
>
>>With a 4KB block size the max size of an ext2/3 partition is 16TB.
>
>Ta.
>
>>Could you have formatted it with a 1K block size? That would
>>limit it to 2TB .
>
>My thought too. Trouble is, it's all sorts of nonsense with external
>SCSI arrays and warnings of the OS not being able to access them if they
>exceed 2TB in size, deciding whether to use a GUID partition table or
>not, and using the pv* and lv* tools to create the logical drive. The
>machine is live and so I'm not able to take it down to fiddle, and I
>can't get settled to sit and spend a good session with it, and and and
><grrr>

Call me a luddite, but I much prefer to work directly with raw drives,
partition them, then use linux-raid to assemble them into arrays.

That Dell box was a pig and mis-sold to me - I wanted Just a Box of disks
with each disk device directly seeable to Linux. Dell told me that was
what I could have, but they lied, and to bypass their built-in RAID
controller, I had to create 15 single-drive RAID-0 units which Linux
then saw as sda through sdo.

>>mkfs usually picks the "right thing" though - but if you did mkfs.ext2
>>by hand, or some external program, maybe it got it wrong?
>
>I used the following line:
>mke2fs -v -j -m 0 /dev/md0
>
>The -j is all I need to make it ext3, right? No further prep needed?

Should be - I typically just use

mkfs -t ext3 /dev/mdX

Remember the tune2fs -i0 -c0 /dev/mdX too (turns off the time- and
mount-count-based fsck checks) ...

>I didn't specify blocks in mke2fs, but did in the LVM creation:
>lvcreate -i 2 -I 4 -l 75155 -n LogVol02 VolGroup02 /dev/sdc1 /dev/sdc2

Those numbers just describe the underlying block device (-l is the
number of LVM extents, -I the stripe size in KB) - the block size in
mkfs is a separate quantity used by the filing system: it sets the
minimum amount of disk space a file can actually take up.
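
If you want to see the two quantities side by side, something like:

vgdisplay VolGroup02                  # PE (extent) size, used by -l
lvdisplay /dev/VolGroup02/LogVol02    # stripes and stripe size, from -i/-I
tune2fs -l /dev/VolGroup02/LogVol02 | grep 'Block size'

makes it fairly clear they're independent settings.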

>Don't think I trust this any more. I'll take the safe option and delete
>the RAID partitions, then reformat as standard ext3. I'll end up with
>two partitions rather than the one I wanted, but I can work with that.
>
>TYVM for your help (also to anahata)

I'm guessing there is an underlying RAID controller presenting these
2 "drives" to you? You could simply use Linux s/w RAID to combine them
using RAID-0 if that's all you're after, rather than fiddling with LVM.

mdadm --create /dev/md0 -l0 -n2 /dev/sd{c,d}
mkfs -t ext3 /dev/md0
tune2fs -i0 -c0 /dev/md0

and off you go..

(Although I'd probably put a single full-size partition on each of
/dev/sdc and /dev/sdd, though I can't remember why I think that's a
better idea.)
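
If you do partition them, give them a GPT label rather than a plain
msdos one - an msdos partition table can't describe a partition beyond
2TiB, which may even be tangled up in your original problem. Something
like:

parted -s /dev/sdc mklabel gpt
parted -s /dev/sdc mkpart primary 0% 100%

(same again for /dev/sdd - it wipes the existing partition table,
obviously), then build the md device from /dev/sdc1 and /dev/sdd1
instead.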

Gordon
From: Mike Tomlinson on
In article <hmr9be$uvo$1(a)energise.enta.net>, Gordon Henderson
<gordon+usenet(a)drogon.net> writes

>I tried LVM in the early days - got burnt, tried it briefly in the
>LVM2 days, but didn't really see the point

But it's the default on most distro installs! First time I saw it
(Fedora Core 1 maybe) I just thought "why?!" On a _single-disk_
workstation running the default CD-boot install...

>However, I tend to build servers to a purpose and have rarely had the
>need to re-size the disk array on a "live" box.

Yes. Long story. Not something to do on a Friday afternoon after a
lunchtime pint.

>Call me a luddite, but I much prefer to work directly with raw drives,
>partition them, then use linux-raid to assemble them into arrays.

Amen. The "disks" are actually external rack-mount chassis with their
own built-in controllers. A mixture of SCSI and IDE drives internally (not
on the same one!) but all have external SCSI presentation. They just
look to Linux like a big disk. The drives are hot-plug. I have half a
dozen to look after, all running RAID5 with a hot spare.

The main problem is that the biggest SCSI drive available is 300GB, so
the earlier arrays are starting to look a bit dated and cannot be
upgraded with bigger drives.

The maker does a 24-drive (SATA2 drives internally) unit. We need the
space badly, but I am waiting for the maker to qualify the unit with 2TB
drives before buying.

Because the fault tolerance is in the RAID chassis, I just use Linux
RAID to turn them into a Linux RAID0.
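
The md side is at least easy to keep an eye on:

cat /proc/mdstat
mdadm --detail /dev/md0

which between them tell you everything about the array state.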

>That Dell box was a pig and mis-sold to me - I wanted Just a Box of disks
>with each disk device directly seeable to Linux. Dell told me that was
>what I could have, but they lied

salesweasels, huh.

I never liked Dell kit. Some of it was very cleverly designed, but.
>
>Should be - I typically just use
>
> mkfs -t ext3 /dev/mdX

I use -m 0 to zero the reserved-blocks percentage and maximise usable
space (this is just for data storage; / is on its own disk)

>Those numbers just describe the underlying block device (-l is the
>number of LVM extents, -I the stripe size in KB) - the block size in
>mkfs is a separate quantity used by the filing system: it sets the
>minimum amount of disk space a file can actually take up.

That's part of the problem - the data to be preserved consists of many
tens of thousands of smallish files (10MB each, compressed), which
filesystems do not like. Not all in one directory though :o)

>I'm guessing there is an underlying RAID controller presenting these
>2 "drives" to you?

Yes, hardware, in their rack enclosures.

> You could simply use Linux s/w RAID to combine them
>using RAID-0 if that's all you're after

Yep, that's what I do :)

Why two different software RAID systems (LVM and mdX)? What's the
point? I thought LVM was the all-singing, all-dancing replacement for
the old mdX system.

And I would have to test it (mdX with a very big disk) very carefully.

TVM again for the ideas. It's been good to talk to someone with a
similar setup.

And now pub beckons. Have a good weekend.

--
Mike Tomlinson