From: www.isp2dial.com on
It's been a while since I read c.u.s ... I puzzled for a while over
this problem, maybe the solution will be useful to someone besides me.

Say you want to send one stream down two pipes, and process each one
independently. Like having two independent pointers on the same input
stream. I could not discover any such thing in bash, so I worked it
out using FIFOs and tee.

In the example below, I use find to build a list of mailbox files for
input to sa-learn, and then I truncate the files. I want to process
all the files with a single call of sa-learn (using xargs), so the
first stream has to be completely processed before the second starts.

That was tricky, but I solved it using nested subshells and wait.

Stephane probably has some better way to do this, but here is what I
worked out. :-)


#!/bin/bash

set -B -e +h -u -o pipefail; shopt -s extglob nullglob

pushd . > /dev/null
cd ~/temp

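td=$(mktemp -d)
mkfifo "$td/3"
mkfifo "$td/4"

( (
    # First reader: feed the whole list to a single sa-learn run.
    exec < "$td/3"
    xargs -0 -r sa-learn --spam --mbox
  ) &
  # Second reader: open the FIFO now so tee can proceed, but wait for
  # sa-learn to finish before truncating anything.
  exec < "$td/4"
  wait
  # -r keeps backslashes in file names intact; -d '' splits on NUL.
  while IFS= read -r -d ''; do
    cp /dev/null "$REPLY"
  done
) &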

find . -type f -size +0c \( \
-wholename './UCE/*' -o \
-name 'uce.*' \
\) -printf '%P\0' |
sort -z | tee "$td/3" "$td/4" > /dev/null

wait
rm -rf "$td"

popd > /dev/null
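
For anyone who wants to try the pattern without sa-learn, here is a minimal sketch of the same FIFO, tee and wait arrangement with harmless stand-in consumers. The function name fan_out_in_order and the "first:" / "second:" labels are made up for illustration and are not part of the script above; it assumes bash and a mktemp that supports -d. One caveat: the second reader does not drain its FIFO until the first has exited, so tee can stall once the unread data exceeds the FIFO's buffer (commonly 64 KiB on Linux). The sketch is only meant for lists that fit.

```shell
#!/bin/bash
# Sketch only: fan_out_in_order is a hypothetical helper name.
# Reads NUL-delimited items on stdin, sends them down two FIFOs, and
# starts the second consumer only after the first one has exited.
fan_out_in_order() {
    local td pid
    td=$(mktemp -d) || return 1
    mkfifo "$td/a" "$td/b" || { rm -rf "$td"; return 1; }

    (
        # First consumer (stand-in for xargs ... sa-learn).
        ( tr '\0' '\n' < "$td/a" | sed 's/^/first: /' ) &
        # Open the second FIFO right away so tee can open its write end,
        # but read nothing until the first consumer is done.
        exec < "$td/b"
        wait
        while IFS= read -r -d '' item; do
            printf 'second: %s\n' "$item"
        done
    ) &
    pid=$!

    # Producer side: copy stdin into both FIFOs.
    tee "$td/a" "$td/b" > /dev/null

    wait "$pid"
    rm -rf "$td"
}
```

Called as `printf 'x\0y z\0' | fan_out_in_order`, it prints both "first:" lines before any "second:" line, because the loop only starts after the inner wait returns.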


--
Internet service
http://www.isp2dial.com/

From: Dan Stromberg on
On Sat, 05 Apr 2008 00:54:23 +0000, www.isp2dial.com wrote:

> It's been a while since I read c.u.s ... I puzzled for a while over
> this problem, maybe the solution will be useful to someone besides me.
>
> Say you want to send one stream down two pipes, and process each one
> independently. Like having two independent pointers on the same input
> stream. I could not discover any such thing in bash, so I worked it out
> using FIFOs and tee.
>
> In the example below, I use find to build a list of mailbox files for
> input to sa-learn, and then I truncate the files. I want to process all
> the files with a single call of sa-learn (using xargs), so the first
> stream has to be completely processed before the second starts.
>
> That was tricky, but I solved it using nested subshells and wait.
>
> Stephane probably has some better way to do this, but here is what I
> worked out. :-)
>
>
> #!/bin/bash
>
> set -B -e +h -u -o pipefail; shopt -s extglob nullglob
>
> pushd . > /dev/null
> cd ~/temp
>
> td=`mktemp -d`
> mkfifo "$td/3"
> mkfifo "$td/4"
>
> ( (
> exec < "$td/3"
> xargs -0 -r sa-learn --spam --mbox
> ) &
> exec < "$td/4"
> wait
> while read -d $'\0'; do
> cp /dev/null "$REPLY"
> done
> ) &
>
> find . -type f -size +0c \( \
> -wholename './UCE/*' -o \
> -name 'uce.*' \
> \) -printf '%P\0' |
> sort -z | tee "$td/3" "$td/4" > /dev/null
>
> wait
> rm -rf "$td"
>
> popd > /dev/null

IMO, mtee is a lot simpler (at least in one's shell code) :

http://stromberg.dnsalias.org/~strombrg/mtee.html

...but ptee probably would've been a better name than mtee. :)

From: pk on
(sorry to reply to Dan, but the original message did not arrive at my NNTP
service)

Dan Stromberg wrote:

>> #!/bin/bash
>>
>> set -B -e +h -u -o pipefail; shopt -s extglob nullglob
>>
>> pushd . > /dev/null
>> cd ~/temp
>>
>> td=`mktemp -d`
>> mkfifo "$td/3"
>> mkfifo "$td/4"
>>
>> ( (
>> exec < "$td/3"
>> xargs -0 -r sa-learn --spam --mbox
>> ) &
>> exec < "$td/4"
>> wait
>> while read -d $'\0'; do
>> cp /dev/null "$REPLY"
>> done
>> ) &
>>
>> find . -type f -size +0c \( \
>> -wholename './UCE/*' -o \
>> -name 'uce.*' \
>> \) -printf '%P\0' |
>> sort -z | tee "$td/3" "$td/4" > /dev/null
>>
>> wait
>> rm -rf "$td"
>>
>> popd > /dev/null

But since you need the first pipeline to finish anyway before starting the
second, why not just do something like

find .... | sort -z | tee tmpfile | xargs -0 -r sa-learn --spam --mbox

and then:

while IFS= read -r -d ''; do
    cp /dev/null "$REPLY"
done < tmpfile

Are there specific reasons you need to do things the way you did?
Just curious...
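
A slightly fuller sketch of the tmpfile idea, in case it helps: it wraps the two passes in a function, creates the temporary file with mktemp, and removes it afterwards. The name two_pass_via_tmpfile and the "second pass:" output are hypothetical stand-ins; the real second pass would truncate each file instead of printing it. It assumes bash and GNU xargs (for -0 and -r).

```shell
#!/bin/bash
# Sketch only: two_pass_via_tmpfile is a hypothetical helper name.
# stdin: NUL-delimited items; "$@": the command for the first pass.
two_pass_via_tmpfile() {
    local tmpfile name
    tmpfile=$(mktemp) || return 1

    # First pass: save the list while xargs hands it to the command.
    tee "$tmpfile" | xargs -0 -r "$@"

    # Second pass: replay the saved list once the first pass is done.
    while IFS= read -r -d '' name; do
        printf 'second pass: %s\n' "$name"
    done < "$tmpfile"

    rm -f "$tmpfile"
}
```

For example, `printf 'a\0b c\0' | two_pass_via_tmpfile printf 'first pass: %s\n'` runs the first-pass command over every item before the loop sees any of them.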

--
All the commands are tested with bash and GNU tools, so they may use
nonstandard features. I try to mention when something is nonstandard (if
I'm aware of that), but I may miss something. Corrections are welcome.

From: www.isp2dial.com on
On Sat, 05 Apr 2008 11:40:25 +0200, pk <pk(a)pk.invalid> wrote:

>(sorry to reply to Dan, but the original message did not arrive at my NNTP
>service)

>But since you need the first pipeline to finish anyway before starting the
>second, why not just do something like
>
>find .... | sort -z | tee tmpfile | xargs -0 -r sa-learn --spam --mbox
>
>and then:
>
>while read -d $'\0'; do
> cp /dev/null "$REPLY"
>done < tmpfile
>
>Are there specific reasons you need to do things the way you did?
>Just curious...

Yours is a good solution, if you don't mind using a tmpfile. But I
wanted to avoid writing any data into the filesystem, since it's only
transient data.

I first tried using internal pipes, but could not work it out that
way, so FIFOs were my next idea.

I was also trying to discover some general method of writing one
stream to multiple pipes, and processing them in a synchronized
sequence. There have been other times when I needed to do something
like that, but I've never known how, until now.
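
A different technique that also keeps the data out of the filesystem, sketched here only as an alternative and not as the method used above: hold the list in a bash array and walk it twice. This assumes bash 4.4 or later (for mapfile -d ''), and the function name two_pass_in_memory and its printed output are hypothetical. Unlike xargs, it makes exactly one call and does not split the list, so a very long list can exceed the system's argument-length limit.

```shell
#!/bin/bash
# Sketch only: two_pass_in_memory is a hypothetical helper name.
# stdin: NUL-delimited items; "$@": the command for the first pass.
two_pass_in_memory() {
    local -a items
    local item

    # Read every NUL-terminated record from stdin into the array.
    mapfile -d '' -t items
    (( ${#items[@]} )) || return 0    # empty input: nothing to do

    # First pass: one call that receives all items as arguments.
    "$@" "${items[@]}"

    # Second pass: runs only after the first command has returned.
    for item in "${items[@]}"; do
        printf 'second pass: %s\n' "$item"
    done
}
```

With the example in this thread, the first-pass command would be `sa-learn --spam --mbox` and the loop body would truncate "$item" instead of printing it.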



From: www.isp2dial.com on
On Sat, 05 Apr 2008 05:33:49 GMT, Dan Stromberg
<dstromberglists(a)gmail.com> wrote:

>IMO, mtee is a lot simpler (at least in one's shell code) :
>
>http://stromberg.dnsalias.org/~strombrg/mtee.html
>
>...but ptee probably would've been a better name than mtee. :)

I looked at your web page, but when I saw "Python" I ran away.

Not that I think there's anything wrong with Python. I'm just too old
to learn any big tricks. Small tricks are all I can do. ;-)


