From: John Kelly on
On Tue, 15 Jun 2010 18:22:37 +0100, pk <pk(a)pk.invalid> wrote:

>> It's not hard. Try it on some junk data to get the hang of it. Once
>> you learn how to use dd, you can do ANYTHING. Including wiping out your
>> hard drive, heh.
>
>in this case, perhaps "split" could also have been used.

Never heard of it till now, but I see it with "man split"

Doesn't sound like as much fun as dd though, you can't wipe out your
hard drive with split. :-D
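
For anyone else who had not met split before, here is a minimal sketch on junk data. The file names and the 1000-byte piece size are made up for illustration. Note that split cuts on byte counts, not on message boundaries, so pieces of an mbox would usually begin mid-message.

```shell
# Work in a scratch directory so nothing real is touched.
work=$(mktemp -d)
cd "$work" || exit 1

# 2500 bytes of junk data (hypothetical test file).
head -c 2500 /dev/zero | tr '\0' 'x' > junk

# Cut it into pieces of at most 1000 bytes: chunk_aa, chunk_ab, chunk_ac.
split -b 1000 junk chunk_

# Gluing the pieces back together in name order restores the original.
cat chunk_* > rejoined
cmp junk rejoined && echo "pieces match the original"
```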



--
Web mail, POP3, and SMTP
http://www.beewyz.com/freeaccounts.php

From: Loki Harfagr on
Tue, 15 Jun 2010 18:46:54 +0200, Tuxedo did cat :

> John Kelly wrote:
>
>> On Tue, 15 Jun 2010 16:20:20 +0000, John Kelly <jak(a)isp2dial.com>
>> wrote:
>>
>>
>> >"seek" determines the starting block
>>
>> Whoops! I meant skip, not seek
>>
>>
>> >Experiment with some junk file till you get the hang of it. But be
>> >careful, you can really hurt yourself with dd while logged in as root.
>>
>> See what I mean
>
> I think this is all too complex for my basic understanding of dd :-)
>
> Tuxedo

Did you try my suggestion with multithread formail/procmail?
Did it fail?
If so, maybe try another simple man's idea with awk (not tested on Solaris ;-)
$ awk '/^From /{a++}{print > ("/tmp/_mb_" a)}' yourBigBox

(adapt "/tmp/_mb_" prefix part for your fav path)
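
A variant of the awk idea, as a sketch only (not tested on Solaris either). It closes each piece before starting the next one, which should keep a box with thousands of messages from hitting the open-file limit that some awks have. The sample box, the addresses in it and the out/_mb_ prefix are invented for the demo.

```shell
work=$(mktemp -d)
cd "$work" || exit 1
mkdir out

# Tiny hypothetical mbox with three messages.
printf 'From a@example.org Tue Jun 15 10:00:00 2010\nSubject: one\n\nfirst\n\n' > sampleBox
printf 'From b@example.org Tue Jun 15 11:00:00 2010\nSubject: two\n\nsecond\n\n' >> sampleBox
printf 'From c@example.org Tue Jun 15 12:00:00 2010\nSubject: three\n\nthird\n\n' >> sampleBox

# Start a new output file at every "From " line; close the previous one first.
# Lines before the first "From " line (if any) are skipped because f is still empty.
awk '/^From /{ if (f) close(f); f = "out/_mb_" (++n) }
     f { print > f }' sampleBox

ls out
```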

or maybe try it with 'csplit' (read the info page for extended parms on paths)
$ csplit -z yourBigBox '/^From /' '{*}'

Note that these are really not foolproof: if some mails in the box themselves contain
other mails, you may have to check for that and glue some pieces back together.
But if your box is a plain, simple mbox you should be safe with that :-)
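
A small dry run of the csplit variant, assuming GNU csplit (-z is a GNU option). The sample box and the msg_ prefix are made up. Comparing the number of pieces with the number of "From " lines is a cheap sanity check, though it will not catch a message body that happens to contain an unquoted "From " line.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Small hypothetical mbox with two messages.
printf 'From a@example.org Tue Jun 15 10:00:00 2010\nSubject: one\n\nfirst\n\n' > sampleBox
printf 'From b@example.org Tue Jun 15 11:00:00 2010\nSubject: two\n\nsecond\n\n' >> sampleBox

# -z drops the empty piece in front of the first "From " line,
# -s keeps csplit from printing the size of every piece,
# -f sets the output prefix (the default is xx).
csplit -s -z -f msg_ sampleBox '/^From /' '{*}'

# Sanity check: there should be one piece per "From " line.
echo "From lines: $(grep -c '^From ' sampleBox)"
echo "pieces:     $(ls msg_* | wc -l)"
```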
From: John Kelly on
On 15 Jun 2010 17:45:35 GMT, Loki Harfagr
<l0k1(a)thedarkdesign.free.fr.INVALID> wrote:

>try it with 'csplit' (read the info page for extended parms on paths)
>$ csplit -z yourBigBox '/^From /' '{*}'

That's a good one to remember.




From: Chris F.A. Johnson on
On 2010-06-15, Tuxedo wrote:
> Chris F.A. Johnson wrote:
>
> [...]
>
>> Use formail:
>>
>> formail -s savemail < "$mbox"
>>
>> Where savemail is a script containing:
>>
>> cat > $(date +%Y-%m-%d_%H:%M:%S)-$(uuidgen)
>>
>> This will put each message in a separate file. Adjust to taste if
>> you want to put more than one message into each file or to use
>> different filenames.
>
> Thanks for this procedure, it works fine on a not-too-large mbox. However,
> it fails with the huge file: the system runs out of memory,

It works for me on an mbox file larger than my total RAM.

> as I guess cat or formail tries to read in the full file to process.
> But it's a good example how to split an mbox into individual files.
> I will probably use this idea for something else.


--
Chris F.A. Johnson, author <http://shell.cfajohnson.com/>
===================================================================
Shell Scripting Recipes: A Problem-Solution Approach (2005, Apress)
Pro Bash Programming: Scripting the GNU/Linux Shell (2009, Apress)

From: Tuxedo on
John Kelly wrote:

[...]

> It's not hard. Try it on some junk data to get the hang of it. Once
> you learn how to use dd, you can do ANYTHING. Including wiping out your
> hard drive, heh.

Thanks for the quick tutorial. The wonders of dd have finally come to light!
I first tested it on a smaller already functioning mbox.

I then tested copying the first 100 bytes of the huge file:
dd count=1 bs=100 if=myBigCrapBox of=myBigCrapBox.1
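
For reference, a sketch of the same kind of dd call on junk data, showing the skip/seek difference mentioned earlier in the thread: skip counts blocks of the input, seek counts blocks of the output. The junk file and its contents are hypothetical.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# 300 bytes of junk: 100 x 'a', then 100 x 'b', then 100 x 'c'.
{ head -c 100 /dev/zero | tr '\0' 'a'
  head -c 100 /dev/zero | tr '\0' 'b'
  head -c 100 /dev/zero | tr '\0' 'c'; } > junk

# First block of 100 bytes (same shape as the command above).
dd if=junk of=junk.1 bs=100 count=1 2>/dev/null

# Second block: skip=1 skips one 100-byte block of the INPUT.
dd if=junk of=junk.2 bs=100 count=1 skip=1 2>/dev/null

# seek=1 instead skips one block of the OUTPUT before writing,
# so junk.3 is 200 bytes long with the copied data in its second half.
dd if=junk of=junk.3 bs=100 count=1 seek=1 2>/dev/null

wc -c junk.1 junk.2 junk.3
```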

But from that I realise there must be something wrong with the huge mbox
file. The resulting file, myBigCrapBox.1, should be the first 100 bytes of
ASCII text, but it all appears to be binary data, or nothing at all; one editor
(Nedit) just shows the file as a long line of <nul><nul><nul>, while the
file size is exactly 100 bytes.
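
A way to check what such a chunk really contains without relying on an editor, as a rough sketch. Here sample.1 is a stand-in filled with NUL bytes so the commands have something to chew on; on the real data the file name would be the dd output instead.

```shell
work=$(mktemp -d)
cd "$work" || exit 1

# Stand-in for the 100-byte dd output: 100 NUL bytes (hypothetical test data).
head -c 100 /dev/zero > sample.1

# Dump the bytes; od collapses runs of identical output lines into a single "*".
od -c sample.1

# Count the bytes that are NOT NUL; 0 means the chunk is nothing but NULs.
nonnul=$(tr -d '\0' < sample.1 | wc -c)
echo "non-NUL bytes: $nonnul"
```

If the count is 0 for chunks taken from several places in the big file, the data was most likely never written there, which would point at the copy rather than at any special Mozilla format.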

I'm not sure how this happened because all other smaller mboxes created by
Mozilla Thunderbird are indeed in plain text format.

Does anyone know if the Mozilla mail applications use some kind of exotic
compression format for files above a certain size?

I even tested placing the resulting 100-byte file in a Mozilla mail
directory. The mail folder (or file) shows up but is empty, without even the
start of a single message. I also tested with versions longer than 100 bytes.

I guess I have been doomed with a corrupt mbox file! But how can such a large
2.8GB file contain nothing readable? It should be a direct copy of the mbox
and a full version of the file, not one truncated at a 2GB limit by ftp or
other file transfer. I copied the file from the original Windows drive via
USB Flash media directly onto a Linux system where I ran the dd command.

Thanks for any advice or theories on how this possibly corrupt mbox may be
reinvigorated and viewed.

Tuxedo