From: Nigel Wade on
On Fri, 05 Mar 2010 14:05:17 +0000, Mike Tomlinson wrote:

> In article <hmj1n7$bu6$1(a)energise.enta.net>, Gordon Henderson
> <gordon+usenet(a)drogon.net> writes
>>In article <+L3X1jAbmPjLFw47(a)none.invalid>, Mike Tomlinson
>><none(a)none.invalid> wrote:
>>
>>>What am I doing wrong? Is the big LVM volume too big for rsync or
>>>ext2fs tools to cope with, or what? First time I've encountered
>>>anything like this.
>>
>>Been some years since I've played with multi TB systems, but the last
>>big one I built has 15 x 500GB drives in it which I used Linux s/w
>>RAID-6 to combine into a single 6.5TB drive (under ext3) - which then
>>ran for a few years as a remote off-site backup device, using rsync to a
>>bunch of servers.
>
> Doing something very similar. I only noticed a message from rsync quite
> by chance:
>
> rsync: writefd_unbuffered failed to write 4 bytes [sender]: Broken pipe
> (32)
> rsync: write failed on "/mnt/backup": No space left on device (28)
>
> What is really nasty is that there is NOTHING in the syslog. And I just
> happened to have rsync running in verbose mode and was watching the
> screen; otherwise I wouldn't have known about this. It normally runs
> the backup silently from a script.
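
You can at least get it into syslog yourself if the script checks
rsync's exit status - a minimal sketch, with placeholder paths:

  rsync -a /data/ /mnt/backup/ \
      || logger -t backup "rsync failed, exit status $?"
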
>
> Kinda lost confidence in LVM now. The most recent HOWTO I could find is
> little more than a list of available commands, but I need more than
> that.
>
>>I don't recall having to do anything special to make it work (other than
>>swear, curse and generally get annoyed with Dell) - I did compile a
>>custom kernel with all the big-file stuff ticked, but I imagine you
>>have that in your system.
>
> Yes
>
>> This was running Debian Etch IIRC.
>
> CentOS 5.2
>
>>With a 4KB block size the max size of an ext2/3 partition is 16TB.
>
> Ta.
>
>>Could you have formatted it with a 1K block size? That would limit it to
>>2TB.
>
> My thought too. Trouble is, it's all sorts of nonsense with external
> SCSI arrays and warnings of the OS not being able to access them if they
> exceed 2TB in size, deciding whether to use a GUID partition table or
> not, and using the pv* and lv* tools to create the logical drive. The
> machine is live and so I'm not able to take it down to fiddle, and I
> can't get settled to sit and spend a good session with it, and and and
> <grrr>
>
>>mkfs usually picks the "right thing" though - if you did mkfs.ext2 by
>>hand, or used some external program, maybe it got it wrong?
>
> I used the following line:
> mke2fs -v -j -m 0 /dev/md0
>
> The -j is all I need to make it ext3, right? No further prep needed?
>
> I didn't specify blocks in mke2fs, but did in the LVM creation: lvcreate
> -i 2 -I 4 -l 75155 -n LogVol02 VolGroup02 /dev/sdc1 /dev/sdc2
>
> Don't think I trust this any more. I'll take the safe option and delete
> the RAID partitions, then reformat as standard ext3. I'll end up with
> two partitions rather than the one I wanted, but can work with that.
>
> TYVM for your help (also to anahata)

It should work. I have ext3 partitions of 6 and 8TB on RedHat 5.4 systems.
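
If you want to rule out the 1K-block-size theory (and check that the -j
really did give you a journal) without touching anything, tune2fs will
tell you. A quick sketch, using the /dev/md0 from your mke2fs line:

  tune2fs -l /dev/md0 | grep -i 'block size'   # should say 4096
  tune2fs -l /dev/md0 | grep -i 'features'     # look for has_journal
  df -h /mnt/backup                            # sanity-check the size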

About the only thing you are doing differently to me is that you are
striping your LUNs using LVM. Mine are both single LUNs from fibre
attached RAIDs. If you want a partition table on them (there's no
requirement to do so; you can just format the entire device) you have to
use GPT - I think DOS partition tables are limited to 2TB per partition.
My 6TB LUN has no partition table; the 8TB one has a GPT table for no
reason other than that I wanted to see what happened when I created one.
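
Creating the GPT label is only a couple of parted commands anyway -
something along these lines (a sketch only; the device name is just an
example, the exact mkpart syntax varies between parted versions, and
mklabel will destroy any existing partition table):

  parted /dev/sdX mklabel gpt
  parted /dev/sdX mkpart primary ext3 0% 100%
  parted /dev/sdX print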

The SCSI errors tell me there's something wrong in the comms between the
host and the RAID controller. Are the errors always on sd 1:0:3:0?

I'd first unmount any filesystems on those RAIDs and try power cycling
the units. Then, if that doesn't work, power cycle the host. I have
occasionally seen SCSI errors due to malfunctioning SCSI device drivers
and/or SCSI controllers which are magically fixed by power cycling. Also,
when you are doing this, double-check the SCSI cabling and termination;
SCSI has zero tolerance for anything which is not optimal.
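
It's also worth grepping the kernel log for that particular device
before and after the power cycle, so you can see whether the errors
actually stop - e.g. (sketch):

  dmesg | grep 'sd 1:0:3:0'
  grep 'sd 1:0:3:0' /var/log/messages | tail -20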

--
Nigel Wade


From: Gordon Henderson on
In article <CH6YmgAyaTkLFwHP(a)none.invalid>,
Mike Tomlinson <none(a)none.invalid> wrote:
>In article <hmr9be$uvo$1(a)energise.enta.net>, Gordon Henderson
><gordon+usenet(a)drogon.net> writes
>
>>I tried LVM in the early days - got burnt, tried it briefly in the
>>LVM2 days, but didn't really see the point
>
>But it's the default on most distro installs! First time I saw it
>(Fedora Core 1 maybe) I just thought "why?!" On a _single-disk_
>workstation running the default CD-boot install...

Is it? *boggle* However I stick with Debian - I think it gives me a
choice when I do the base install, but I've never bothered with it.

>>That's just the underlying block device - the block size in mkfs is a
>>separate quantity used by the filing system - it specifies the minimum
>>amount of disk space that a file can actually take up.
>
>That's part of the problem - the data to be preserved consists of many
>tens of thousands of smallish files (10MB each, compressed), which
>filesystems do not like. Not all in one directory though :o)

Newer ext3s are OK with lots of files in one directory thanks to the
dir_index option - which ought to be on by default, but you can upgrade
a filesystem that was created with an older version. It needs to be
offline and have a full fsck run though.
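
The upgrade itself is only a couple of commands - roughly this,
borrowing /dev/md0 and /mnt/backup from earlier in the thread, and
assuming nothing has it mounted (sketch):

  umount /mnt/backup              # filesystem must be offline
  tune2fs -O dir_index /dev/md0   # enable the feature
  e2fsck -fD /dev/md0             # forced fsck; -D (re)indexes directories
  mount /mnt/backup               # remount (assuming an fstab entry)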

>Why two different software RAID systems (LVM and mdX)? What's the
>point? I thought LVM was the all-new dancing replacement for the old
>mdX system.

Well.. The only reason I looked at LVM was for the snapshot facility,
but in the end I did without it - my experiments way back resulted in
cripplingly slow systems after a few snapshots had been taken; that and
the random crashes made me give up on it and move to plan B.
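
For anyone curious, the general shape of it is only a handful of
commands (a sketch, borrowing the volume names from earlier in the
thread - the 10G is just however much change you expect while the
snapshot exists, and the VG needs that much free space):

  lvcreate -s -L 10G -n backupsnap /dev/VolGroup02/LogVol02
  mount /dev/VolGroup02/backupsnap /mnt/snap
  # ... run the backup from /mnt/snap instead of the live filesystem ...
  umount /mnt/snap
  lvremove -f /dev/VolGroup02/backupsnap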

>And I would have to test it (mdX) with a very big disk very carefully.
>
>TVM again for the ideas. It's been good to talk to someone with a
>similar setup.
>
>And now pub beckons. Have a good weekend.

Indeed!

Cheers,

Gordon

From: Chris Davies on
Gordon Henderson <gordon+usenet(a)drogon.net> wrote:
> That Dell box was a pig and mis-sold to me - I wanted Just a Box of disks
> with each disk device directly seeable to Linux. Dell told me that was
> what I could have, but they lied, and to bypass their built-in RAID
> controller, I had to create 15 single-drive RAID-0 units which Linux
> then saw as sda through sdo.

This isn't quite the same: the PERC will still hide some disk failures
from you. If you haven't already got it, go and install OpenManage
(at least deb format; probably rpm and tar, too) and hack a cron job
together to do what mdadm --monitor would do for "real" drives.
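
Very roughly the sort of thing I mean - a sketch only, assuming omreport
is on the PATH and the array is controller 0; the field names move
around between OMSA versions:

  #!/bin/sh
  # Mail root if any PERC physical disk reports a state other than Online.
  BAD=$(omreport storage pdisk controller=0 | grep '^State' | grep -v 'Online')
  if [ -n "$BAD" ]; then
      echo "$BAD" | mail -s "PERC disk state warning on $(hostname)" root
  fi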

Get back to me (here) if you would like my script.
Chris
From: Gordon Henderson on
In article <rh5867xv81.ln2(a)news.roaima.co.uk>,
Chris Davies <chris(a)roaima.co.uk> wrote:
>Gordon Henderson <gordon+usenet(a)drogon.net> wrote:
>> That Dell box was a pig and mis-sold to me - I wanted Just a Box of disks
>> with each disk device directly seeable to Linux. Dell told me that was
>> what I could have, but they lied, and to bypass their built-in RAID
>> controller, I had to create 15 single-drive RAID-0 units which Linux
>> then saw as sda through sdo.
>
>This isn't quite the same: the PERC will still hide some disk failures
>from you.

Yes it did - I couldn't run any SMART stuff IIRC.

> If you haven't already got it, go and install OpenManage
>(at least deb format; probably rpm and tar, too) and hack a cron job
>together to do what mdadm --monitor would do for "real" drives.
>
>Get back to me (here) if you would like my script.

No worries - this was some years back and that box has since "moved
on" (it was sold on eBay after the company, er, re-structured), and
I'll never buy/specify a Dell JBOD system ever again because of it.
It was overcomplicated and expensive for what I needed at the time,
but manglement had dictated that because we were no longer a thrifty
startup we ought to be buying "proper" servers rather than building
our own...

It was fast though - I recall write speeds of round about 260MB/sec into
a RAID-6 array - it was split over 2 SAS controllers with 6 drives on
one chain and 7 on the other... Dell wouldn't offer RAID-6, and having
suffered total array failures after reconstruction failures of a RAID-5
set, I wasn't going back there again...

Gordon
From: Nix on
On 7 Mar 2010, Paul Martin uttered the following:

> btrfs allows both growth and shrinkage. I have my /home on btrfs at
> the moment, and am keeping regular backups. It seems OK so far.
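
The growing and shrinking certainly look painless - something like this,
though the exact syntax has moved around between btrfs-progs releases:

  btrfs filesystem resize +10g /home    # grow by 10GB, online
  btrfs filesystem resize -10g /home    # shrink by 10GB, also online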

So, does the snapshotty stuff make it worth using?