From: pk on
I'm surely missing something obvious here.

I have a stream of data coming from a pipeline. The format is something like

....
aaaaaaaa
bbbbbbbb
cccccccc
END
jjjjjjjj
kkkkkkkk
END
llllllll
END
mmmmmmmm
nnnnnnnn
oooooooo
pppppppp
END
etc.etc.

that is, the data is composed of logical "blocks", of variable length, each
block ends at (doh) the END line. Now, I have another command that can
process individual blocks, but it's not able to take the full stream. It
should be given one block at a time.

So the straightforward loop

..... | while IFS= read -r line; do
  block="${block}${line}\n"

  if [ "$line" = "END" ]; then
    # yes it's unsafe but that's not the point here
    printf "$block" | command
    block=
  fi
done > final_output.txt

works, but I have the impression that I'm overcomplicating it. However, I
cannot find a simpler way. Any suggestion?
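
For reference, the "unsafe" part of the loop above is that the data ends up in the printf format string, so any % or \ in a line gets interpreted. A minimal sketch of the same loop without that problem might look like the following. The names per_block and count_lines are made up for illustration, and count_lines only stands in for the real command, which the post does not name.

```shell
# A literal newline, so that blocks are built from real line breaks
# rather than from "\n" sequences that printf has to expand later.
nl='
'

# per_block CMD [ARGS...]: read stdin and run CMD once for every block
# that ends with an END line. (Hypothetical helper name.)
per_block() {
    block=
    while IFS= read -r line; do
        block="${block}${line}${nl}"
        if [ "$line" = "END" ]; then
            # '%s' prints the data verbatim; % and \ are not interpreted
            printf '%s' "$block" | "$@"
            block=
        fi
    done
}

# Stand-in for the real command: print the number of lines it was given.
count_lines() { wc -l | tr -d ' '; }

printf 'a\nb\nEND\nc\nEND\n' | per_block count_lines
```

With the two-block sample input, count_lines runs twice, once with three lines and once with two. Lines after the last END are silently dropped, as in the original loop.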
From: Bill Marcum on
On 2010-03-01, pk <pk(a)pk.invalid> wrote:
>
> So the straightforward loop
>
> .... | while IFS= read -r line; do
> block="${block}${line}\n"
>
> if [ "$line" = "END" ]; then
> # yes it's unsafe but that's not the point here
> printf "$block" | command
> block=
> fi
> done > final_output.txt
>
> works, but I have the impression that I'm overcomplicating it. However, I
> cannot find a simpler way. Any suggestion?

Untested:
awk 'BEGIN{RS=ORS="END\n"} {print | "command"; close("command")}'
From: Janis Papanagnou on
pk wrote:
> I'm surely missing something obvious here.
>
> I have a stream of data coming from a pipeline. The format is something like
>
> ...
> aaaaaaaa
> bbbbbbbb
> cccccccc
> END
> jjjjjjjj
> kkkkkkkk
> END
> llllllll
> END
> mmmmmmmm
> nnnnnnnn
> oooooooo
> pppppppp
> END
> etc.etc.
>
> that is, the data is composed of logical "blocks", of variable length, each
> block ends at (doh) the END line. Now, I have another command that can
> process individual blocks, but it's not able to take the full stream. It
> should be given one block at a time.
>
> So the straightforward loop
>
> .... | while IFS= read -r line; do
> block="${block}${line}\n"
>
> if [ "$line" = "END" ]; then
> # yes it's unsafe but that's not the point here
> printf "$block" | command
> block=
> fi
> done > final_output.txt
>
> works, but I have the impression that I'm overcomplicating it. However, I
> cannot find a simpler way. Any suggestion?

awk '{ print | "command" }
/^END$/ { close("command") }'


Janis
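
To try the awk approach above without the real command, a small wrapper along these lines may help. It is only a sketch: split_blocks is a made-up name, and 'wc -l' stands in for the actual per-block command. The command string is passed in with -v, so awk will process backslash escapes in it.

```shell
# split_blocks 'CMD': read stdin and run CMD (via awk's output pipe, i.e.
# through sh) once for every block that ends with an END line.
# (Hypothetical helper name.)
split_blocks() {
    awk -v cmd="$1" '
        { print | cmd }            # send every line, END included, to the command
        /^END$/ { close(cmd) }     # end of block: close the pipe and wait for cmd
    '
}

# Stand-in command: count the lines of each block.
printf 'a\nb\nEND\nc\nEND\n' | split_blocks 'wc -l'
```

Because close() waits for the command to finish, the output of successive blocks comes out in order. The next print after a close() starts a fresh instance of the command.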
From: pk on
Janis Papanagnou wrote:

>> works, but I have the impression that I'm overcomplicating it. However, I
>> cannot find a simpler way. Any suggestion?
>
> awk '{ print | "command" }
> /^END$/ { close("command") }'

Yes, thanks (and to Bill). I was thinking of something more shell-ish rather
than calling external commands in awk, but that'll do.

Thank you!

From: Ed Morton on
On 3/2/2010 3:09 AM, pk wrote:
> Janis Papanagnou wrote:
>
>>> works, but I have the impression that I'm overcomplicating it. However, I
>>> cannot find a simpler way. Any suggestion?
>>
>> awk '{ print | "command" }
>> /^END$/ { close("command") }'
>
> Yes, thanks (and to Bill). I was thinking of something more shell-ish rather
> than calling external commands in awk, but that'll do.
>
> Thank you!
>

How about something like (untested, but I know you know awk...):

awk -v RS="END" -v ORS="\n" -v FS="\n" -v OFS="^L" '{$1=$1}1' file |
while IFS= read -r block; do
  echo "$block" | tr '^L' '\n' | command
done

where the ^L is control-L or some other control character that's not in your input.

Regards,

Ed.
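
A variant of the same idea (one line per block, with a control character standing in for the newlines) can be sketched without relying on a multi-character RS, which not every awk supports. This is not the command above but a rewrite under assumptions: form feed (\f) is assumed not to occur in the input, the END line is kept as part of each block, and the names per_block_ff and count_lines are made up for illustration.

```shell
# per_block_ff CMD [ARGS...]: read stdin and run CMD once per END-terminated
# block. (Hypothetical helper name.)
per_block_ff() {
    # Join the lines of each block with form feeds; only the END line is
    # followed by a real newline, so each block becomes exactly one line.
    awk '{ printf "%s%s", $0, ($0 == "END" ? "\n" : "\f") }' |
    while IFS= read -r block; do
        # Turn the form feeds back into newlines and hand the block over.
        printf '%s\n' "$block" | tr '\f' '\n' | "$@"
    done
}

# Stand-in for the real command: print the number of lines it was given.
count_lines() { wc -l | tr -d ' '; }

printf 'a\nb\nEND\nc\nEND\n' | per_block_ff count_lines
```

Lines after the last END never get a terminating newline, so read does not deliver them and they are dropped.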