From: Tim Woodall on 22 Jun 2008 07:29

Friday night (actually very early Saturday morning) I started getting
errors from my backups:

    DUMP: dumping (Pass III) [directories]
    DUMP: dumping (Pass IV) [regular files]
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [block 1385034, ext2blk 0]: count=173129
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [sector 1385034, ext2blk 0]: count=173129
    ....
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [sector 1385147, ext2blk 0]: count=173143
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [sector 1385148, ext2blk 0]: count=173143
    DUMP:
    DUMP:
    DUMP:
    DUMP:
    mount: you must specify the filesystem type

and similar this morning:

    /sbin/lvcreate -A n -L500M -s -nvar-backup /dev/vg0/var
    Logical volume "var-backup" created
    /sbin/e2fsck -p /dev/vg0/var-backup
    /dev/vg0/var-backup: recovering journal
    /dev/vg0/var-backup: clean, 1933/256000 files, 400751/512000 blocks
    ssh -e none -i /root/.ssh/id_rsa_backup backup@dhcpdns 'mkdir -p /mnt/backup/dumps/mailserver.20080622.1'
    mount | ssh -e none -i /root/.ssh/id_rsa_backup backup@dhcpdns 'cat >/mnt/backup/dumps/mailserver.20080622.1/mount.log'
    /sbin/dump -z9 -1u -f - /dev/vg0/var-backup | ssh -e none -i /root/.ssh/id_rsa_backup backup@dhcpdns 'cat >/mnt/backup/dumps/mai
    DUMP: Date of this level 1 dump: Sun Jun 22 02:32:12 2008
    DUMP: Date of last level 0 dump: Sun Jun 1 02:38:52 2008
    DUMP: Dumping /dev/vg0/var-backup (an unlisted file system) to standard output
    DUMP: Label: none
    DUMP: Writing 10 Kilobyte records
    DUMP: Compressing output at compression level 9 (zlib)
    DUMP: mapping (Pass I) [regular files]
    DUMP: mapping (Pass II) [directories]
    DUMP: estimated 1383159 blocks.
    DUMP: Volume 1 started with block 1 at: Sun Jun 22 02:32:13 2008
    DUMP: dumping (Pass III) [directories]
    DUMP: dumping (Pass IV) [regular files]
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [block 1425368, ext2blk 0]: count=178171
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [sector 1425368, ext2blk 0]: count=178171
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [sector 1425369, ext2blk 0]: count=178171
    ...
    DUMP: read error from /dev/vg0/var-backup: Input/output error: [sector 1425542, ext2blk 0]: count=178192
    DUMP:
    DUMP:
    DUMP:
    DUMP:
    DUMP:
    DUMP: fopen on /dev/tty fails: No such device or address
    DUMP: The ENTIRE dump is aborted.
    mount: you must specify the filesystem type

But I can't find what's wrong.

Manually creating the snapshot and running
dump -0 -f /dev/null /dev/vg0/var-backup works fine.

dd if=/dev/vg0/var of=/dev/null will read the entire partition ok.
Ditto dd if=/dev/vg0/var-backup of=/dev/null (although I think in this
case I'm really still mostly reading from /dev/vg0/var).

If I create a separate non-snapshot partition then that also reads OK.

The VG is on a RAID on /dev/hda2 and /dev/hdc2.

I've done smartctl -t long /dev/hd[ac] and there are no errors. I do
notice that hdc is running hotter than hda: now that the machine is
mostly idle again, hda is 25C while hdc is 43C. While running the tests
they were about 40C and 55C respectively.

hdc is newer than hda - the original hdc (bought at the same time as
hda) failed fairly quickly.

smartctl says power-on hours are 8375 and 64324. (I don't believe that
64324 - that's more than 7 years - the tests say they were run at 14465
and 20459 lifetime hours which is more believable - the maxtor site says
the warranty expires on 25th November 2008 for /dev/hda and 9th July
2009 for /dev/hdc)

e2fsck -n -f /dev/vg0/var reports no errors.

I want to identify which disk is having problems before I shut down so
I can then pull that disk. What I really don't want is a problem
shutting down and then the raid getting rebuilt from the faulty disk to
the good disk.

I know I've got backups from Friday but I'd rather not have to go
through the effort of restoring.

I'm about to try
dd if=/dev/hda of=/dev/null and likewise for /dev/hdc to see if that
flags anything.

But is there anywhere else I should be looking? The entire dump took two
minutes so I don't think it's the snapshot volume getting full.

(I've also noticed that dump exits with 0 even when it says "The ENTIRE
dump is aborted")

Tim.

--
God said, "div D = rho, div B = 0, curl E = - @B/@t, curl H = J + @D/@t,"
and there was light.

http://tjw.hn.org/ http://www.locofungus.btinternet.co.uk/

From: Andy Burns on 22 Jun 2008 07:41

On 22/06/2008 12:29, Tim Woodall wrote:

> I'm about to try
> dd if=/dev/hda of=/dev/null and likewise for /dev/hdc to see if that
> flags anything.

That was my first thought, what next would depend on the results ...

From: Tim Woodall on 22 Jun 2008 08:15

On Sun, 22 Jun 2008 12:41:36 +0100, Andy Burns <usenet.april2008(a)adslpipe.co.uk> wrote:
> On 22/06/2008 12:29, Tim Woodall wrote:
>
>> I'm about to try
>> dd if=/dev/hda of=/dev/null and likewise for /dev/hdc to see if that
>> flags anything.
>
> That was my first thought, what next would depend on the results ...

Nothing :-(

Both disks have read from start to end without a murmur:

    hda: Peaked at 47C
    80293248+0 records in
    80293248+0 records out
    41110142976 bytes transferred in 842.769365 seconds (48779826 bytes/sec)

    hdc: Peaked at 60C
    80293248+0 records in
    80293248+0 records out
    41110142976 bytes transferred in 820.883806 seconds (50080343 bytes/sec)

Maybe whatever the problem was, my fiddling has sorted it out. But I
don't know what I might have done. We'll see tonight when the next
backup runs.

Tim.

--
God said, "div D = rho, div B = 0, curl E = - @B/@t, curl H = J + @D/@t,"
and there was light.

http://tjw.hn.org/ http://www.locofungus.btinternet.co.uk/

From: Andrew Halliwell on 22 Jun 2008 10:31

Andy Burns <usenet.april2008(a)adslpipe.co.uk> wrote:
> On 22/06/2008 12:29, Tim Woodall wrote:
>
>> I'm about to try
>> dd if=/dev/hda of=/dev/null and likewise for /dev/hdc to see if that
>> flags anything.
>
> That was my first thought, what next would depend on the results ...

badblocks perhaps?
It is slightly more... thorough..

--
| spike1(a)freenet.co.uk | Windows95 (noun): 32 bit extensions and a |
| | graphical shell for a 16 bit patch to an 8 bit |
| Andrew Halliwell BSc | operating system originally coded for a 4 bit |
| in |microprocessor, written by a 2 bit company, that|
| Computer Science | can't stand 1 bit of competition. |

From: google on 23 Jun 2008 09:10

On Jun 22, 1:15 pm, Tim Woodall <devn...(a)woodall.me.uk> wrote:
>
> Maybe whatever the problem was, my fiddling has sorted it out. But I
> don't know what I might have done. We'll see tonight when the next
> backup runs.
>

No problems today. Oh well. Looks like I'll just have to keep an eye on
it.

Tim.
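
The first post notes that dump exits with 0 even when it prints "The ENTIRE dump is aborted", so a backup script cannot rely on the exit status alone. A minimal sketch of one possible workaround, assuming dump's stderr has been captured to a file; the function name dump_log_ok and the path /tmp/dump.log are hypothetical and not from the thread:

```shell
#!/bin/sh
# Hypothetical helper, not part of the original backup script.
# Scans a captured dump log for the two failure messages quoted in the
# thread and reports failure if either appears.
#   returns 0: neither message found
#   returns 1: at least one message found
#   returns 2: log file missing or unreadable
dump_log_ok() {
    log_file=$1
    if [ ! -r "$log_file" ]; then
        return 2
    fi
    if grep -q -e 'read error from' -e 'The ENTIRE dump is aborted' "$log_file"; then
        return 1
    fi
    return 0
}

# Possible usage (the log path is an assumption):
#   /sbin/dump -z9 -1u -f - /dev/vg0/var-backup 2>/tmp/dump.log | ssh ...
#   dump_log_ok /tmp/dump.log || echo "dump reported errors" >&2
```

Matching on message text is fragile if the wording differs between dump versions, so this is best treated as an extra check alongside the exit status rather than a replacement for it.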