search pattern and display lines around the pattern [Shell]

From: John W. Krahn on 26 Feb 2010 11:50

Ed Morton wrote:
> On 2/25/2010 3:36 PM, John W. Krahn wrote:
>> Harry wrote:
>>> I have about 30 text files containing some MQ object definitions.
>>> I want to locate some CHANNEL definitions with a pattern
>>> "SSLCAUTH(REQUIRED)".
>>
>> ..." | perl -ne'$_ .= <> and redo if s/\+$//; print grep /DEFINE
>> CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/, /.*\n/g'
>> DEFINE CHANNEL ('AAA.SVRCONN') CHLTYPE(SVRCONN)
>> SSLCAUTH(REQUIRED)
>> SSLCIPH(' ')
>
> Please don't take this as a knock against perl as it's not intended that
> way, I'm just genuinely curious. Maybe it's just me but I find the
> syntax of:
>
> perl -ne'$_ .= <> and redo if s/\+$//; print grep /DEFINE
> CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/, /.*\n/g'
>
> very unintuitive compared to this awk syntax to produce the same output:
>
> awk 'BEGIN{RS="";FS="\n"}
> /SSLCAUTH\(REQUIRED\)/{
>     for (i=1;i<=NF;i++)
>         if ($i ~ /DEFINE CHANNEL|SSLCAUTH|SSLCIPH/)
>             print $i
> }'

The perl version of that would be:

perl -F'\n' -00 -ane'
    if ( /SSLCAUTH\(REQUIRED\)/ ) {
        for ( @F ) {
            if ( /DEFINE CHANNEL|SSLCAUTH|SSLCIPH/ ) { print "$_\n" }
        }
    }'
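
Both the awk version (RS="") and the Perl version (-00) read the input in paragraph mode, so each blank-line-separated block is one record. A small sketch to try that out follows. The sample definitions are hypothetical (the real MQ files are not shown here) and assume that a blank line separates one definition from the next.

```shell
#!/bin/sh
# Hypothetical sample data: two channel definitions separated by a blank line.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DEFINE CHANNEL ('AAA.SVRCONN') CHLTYPE(SVRCONN) +
SSLCAUTH(REQUIRED) +
SSLCIPH(' ') +
REPLACE

DEFINE CHANNEL ('BBB.SVRCONN') CHLTYPE(SVRCONN) +
SSLCAUTH(OPTIONAL) +
SSLCIPH(' ') +
REPLACE
EOF

# show_required is a hypothetical helper name.
# RS="" makes every blank-line-separated block one record;
# FS="\n" makes every line of that block one field.
show_required() {
    awk 'BEGIN { RS = ""; FS = "\n" }
    /SSLCAUTH\(REQUIRED\)/ {
        for (i = 1; i <= NF; i++)
            if ($i ~ /DEFINE CHANNEL|SSLCAUTH|SSLCIPH/)
                print $i
    }' "$1"
}

show_required "$sample"
```

With this sample only the DEFINE CHANNEL, SSLCAUTH and SSLCIPH lines of the AAA.SVRCONN block are printed; the BBB.SVRCONN block is skipped as a whole because the record does not contain SSLCAUTH(REQUIRED).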

> Is the perl syntax you used above just a choice you made for the sake of
> brevity

Sort of. The algorithm is different. The perl script above
concatenates together all the lines that end with a '+' and then loops
through that one string using grep() to pick out the correct "lines".

> and in reality it's possible to do the job in perl using a
> syntax that's similar to the awk one? If so, what would that look like?
>
> I suppose I could do something similar to the perl syntax above in GNU
> awk like:
>
> gawk -v RS= '$0!=($0=gensub(/(DEFINE
> CHANNEL[^\n]*).*(SSLCAUTH\(REQUIRED\)).*(SSLCIPH[^\n]*).*/,"\\1\n\\2\n\\3\n",""))'
>
>
> but in reality I wouldn't for the sake of clarity and portability to
> other awks.

John
--
The programmer is fighting against the two most
destructive forces in the universe: entropy and
human stupidity. -- Damian Conway

From: Harry on 26 Feb 2010 11:54

On Feb 26, 5:49 am, Ed Morton <mortons...(a)gmail.com> wrote:
> On 2/25/2010 10:04 PM, Harry331 wrote:
>
> > Harry wrote...
> <snip>
> >> How can I display the file names as well ?
> <snip>
> > Here is my latest revision.
>
> > find . -type f -exec echo \; -exec echo {} \; -exec
> > awk 'BEGIN{RS="";FS="\n"} /SSLCAUTH\(REQUIRED\)/ { for
> > (i=1;i<=NF;i++) if ($i ~ /DEFINE|SSLCAUTH|SSLCIPH/) print
> > $i}' {} \;
>
> In awk the "FILENAME" variable contains (surprise!) the file name so depending
> on how you want your output to be formatted you could do something like either
> of these:
>
> awk 'BEGIN{RS="";FS="\n"}
> /SSLCAUTH\(REQUIRED\)/ {
>     for (i=1;i<=NF;i++)
>         if ($i ~ /DEFINE|SSLCAUTH|SSLCIPH/)
>             print FILENAME,$i
>     print ""
> }'
>
> awk 'BEGIN{RS="";FS="\n"}
> /SSLCAUTH\(REQUIRED\)/ {
>     print FILENAME
>     for (i=1;i<=NF;i++)
>         if ($i ~ /DEFINE|SSLCAUTH|SSLCIPH/)
>             print $i
>     print ""
> }'
>
> I also threw in an extra print "" to give you a blank line between records in
> case that's useful for separating them.
>
> Regards,
>
> Ed.

Perfect!

Thanks a lot, Ed.
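
Since awk sets FILENAME for each input file, one awk process can be given all the files at once instead of being started once per file. A minimal sketch follows, assuming blank-line-separated definitions; the directory, the file names a.mqsc and b.mqsc, and the helper name report are made up for illustration.

```shell
#!/bin/sh
# Hypothetical scratch directory with two small definition files.
dir=$(mktemp -d)
cat > "$dir/a.mqsc" <<'EOF'
DEFINE CHANNEL ('AAA.SVRCONN') CHLTYPE(SVRCONN) +
SSLCAUTH(REQUIRED) +
SSLCIPH(' ')
EOF
cat > "$dir/b.mqsc" <<'EOF'
DEFINE CHANNEL ('BBB.SVRCONN') CHLTYPE(SVRCONN) +
SSLCAUTH(OPTIONAL) +
SSLCIPH(' ')
EOF

# report FILE...: print the file name, then the interesting lines of every
# blank-line-separated block that contains SSLCAUTH(REQUIRED).
report() {
    awk 'BEGIN { RS = ""; FS = "\n" }
    /SSLCAUTH\(REQUIRED\)/ {
        print FILENAME
        for (i = 1; i <= NF; i++)
            if ($i ~ /DEFINE|SSLCAUTH|SSLCIPH/)
                print $i
        print ""
    }' "$@"
}

report "$dir"/*.mqsc
```

The same idea works with find by ending the -exec clause with "{} +" instead of "{} \;", which hands many file names to a single awk invocation.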

From: Ed Morton on 26 Feb 2010 12:01

On 2/26/2010 10:50 AM, John W. Krahn wrote:
> Ed Morton wrote:
>> On 2/25/2010 3:36 PM, John W. Krahn wrote:
>>> Harry wrote:
>>>> I have about 30 text files containing some MQ object definitions.
>>>> I want to locate some CHANNEL definitions with a pattern
>>>> "SSLCAUTH(REQUIRED)".
>>>
>>> ..." | perl -ne'$_ .= <> and redo if s/\+$//; print grep /DEFINE
>>> CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/, /.*\n/g'
>>> DEFINE CHANNEL ('AAA.SVRCONN') CHLTYPE(SVRCONN)
>>> SSLCAUTH(REQUIRED)
>>> SSLCIPH(' ')
>>
>> Please don't take this as a knock against perl as it's not intended
>> that way, I'm just genuinely curious. Maybe it's just me but I find
>> the syntax of:
>>
>> perl -ne'$_ .= <> and redo if s/\+$//; print grep /DEFINE
>> CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/, /.*\n/g'
>>
>> very unintuitive compared to this awk syntax to produce the same output:
>>
>> awk 'BEGIN{RS="";FS="\n"}
>> /SSLCAUTH\(REQUIRED\)/{
>>     for (i=1;i<=NF;i++)
>>         if ($i ~ /DEFINE CHANNEL|SSLCAUTH|SSLCIPH/)
>>             print $i
>> }'
>
> The perl version of that would be:
>
> perl -F'\n' -00 -ane'
>     if ( /SSLCAUTH\(REQUIRED\)/ ) {
>         for ( @F ) {
>             if ( /DEFINE CHANNEL|SSLCAUTH|SSLCIPH/ ) { print "$_\n" }
>         }
>     }'

Got it. Thanks!

>
>> Is the perl syntax you used above just a choice you made for the sake
>> of brevity
>
> Sort of. The algorithm is different. The perl script above concatenates
> together all the lines that end with a '+' and then loops through that
> one string using grep() to pick out the correct "lines".

OK, I get that concept, but I don't understand how it knows, when it finds a
"DEFINE CHANNEL" line in the concatenated string, that it will later find a
"SSLCAUTH\(REQUIRED\)" string on a subsequent line so it's OK to print that
first line.

Ed.
>
>
>> and in reality it's possible to do the job in perl using a syntax
>> that's similar to the awk one? If so, what would that look like?
>>
>> I suppose I could do something similar to the perl syntax above in GNU
>> awk like:
>>
>> gawk -v RS= '$0!=($0=gensub(/(DEFINE
>> CHANNEL[^\n]*).*(SSLCAUTH\(REQUIRED\)).*(SSLCIPH[^\n]*).*/,"\\1\n\\2\n\\3\n",""))'
>>
>>
>> but in reality I wouldn't for the sake of clarity and portability to
>> other awks.
>
>
>
> John

From: pk on 26 Feb 2010 13:01

Ed Morton wrote:

>> Sort of. The algorithm is different. The perl script above concatenates
>> together all the lines that end with a '+' and then loops through that
>> one string using grep() to pick out the correct "lines".
>
> OK, I get that concept, but I don't understand how it knows, when it finds a
> "DEFINE CHANNEL" line in the concatenated string, that it will later find a
> "SSLCAUTH\(REQUIRED\)" string on a subsequent line so it's OK to print
> that first line.

I'm sure he'll be able to explain much better...anyway (slightly
reformatted):

-------------------
perl -ne

This (-n) enables an implicit loop over the input, ie the same that happens
in awk automatically. By default, records end at \n, like in awk. "$_" is
like awk's "$0", ie it contains the record (plus a ton of other things, but
here that's what it does). One difference is that the trailing \n is NOT
removed so it's part of "$_".
-------------------
$_ .= <> and redo if s/\+$//;

(Note the "quasi natural language" syntax of this Perl code)
This concatenates successive input lines as long as the resulting string
ends in "+". <> is (VERY roughly) like awk's getline here. "." is the
concatenation operator, s/// is like sed's substitution and by default it
operates on "$_" (and the result is true if the replacement is performed),
so the above code is similar (not completely equivalent) to this awk code:

while (sub(/\+$/,"") && (getline a) > 0) $0 = $0 "\n" a; $0 = $0 "\n"

Note that even though the trailing newline is preserved in Perl, /\+$/ still
matches what one expects, because "$" also matches just before a final newline.
-------------------

print grep /DEFINE CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/, /.*\n/g'

This is a very idiomatic construct. We should start from the end.

/.*\n/g

/pattern/ is Perl's match operator, and by default it matches against "$_".
The /g modifier performs a global match, ie match as many times as possible.
Believe it or not, due to Perl rules the result of the above "/.*\n/g" is an
array containing all the "lines" in "$_". That would be like awk's

n = split($0, lines, /\n/)

except again each array element does end in "\n" in the Perl version.

The Perl array is unnamed, and "grep" operates on that array.
The syntax used for grep here is

grep EXPRESSION,LIST

where the unnamed array is the LIST part in the above synopsis.
What grep does is to apply the EXPRESSION to each element of LIST, and
return the elements for which EXPRESSION is true.

EXPRESSION here is

/DEFINE CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/

so it is a match against the enclosed pattern. So, each line in the unnamed
array is matched against that pattern, and if it matches grep returns it.
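
A small way to see the list-context match and grep working together is sketched below. The three-line input and the helper name pick_lines are made up for illustration; -0777 is used only so that Perl reads the whole input into "$_" in one go.

```shell
#!/bin/sh
# Hypothetical input; the trailing newline is part of the string.
text="DEFINE CHANNEL ('AAA.SVRCONN') CHLTYPE(SVRCONN)
DESCR('not interesting')
SSLCIPH(' ')
"

pick_lines() {
    # /.*\n/g in list context yields every line of $_ (newline included);
    # grep then keeps only the elements that match the first pattern,
    # and print writes the surviving elements as they are.
    printf '%s' "$1" |
        perl -0777 -ne 'print grep /DEFINE CHANNEL|SSLCIPH/, /.*\n/g'
}

pick_lines "$text"
```

Here the DESCR line is dropped and the other two lines are printed unchanged.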

In awk it would be something like:

count=0
for(i=1;i<=n;i++){
    if(lines[i] ~ /DEFINE CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/) {
        count++
        matchinglines[count] = lines[i]
    }
}

with again the difference that in awk you have to explicitly name the
arrays.

The result of grep is again a (possibly smaller) unnamed array, which is fed
to "print" to be printed. Since each element already has its original
trailing newline, it will be printed in a separate line. To complete, in awk
that would be

for(i=1;i<=count;i++){
    print matchinglines[i]
}

So the complete awk code would be (untested!)

awk '{
    # stop joining if getline hits end of input, otherwise this could loop forever
    while (sub(/\+$/,"") && (getline a) > 0) $0 = $0 "\n" a
    $0 = $0 "\n"
    n = split($0, lines, /\n/)
    count=0
    for(i=1;i<=n;i++){
        if(lines[i] ~ /DEFINE CHANNEL|SSLCAUTH\(REQUIRED\)|SSLCIPH/) {
            count++
            matchinglines[count] = lines[i]
        }
    }
    for(i=1;i<=count;i++){
        print matchinglines[i]
    }
}'
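
For comparison, here is a sketch of a variant that joins continuation lines without getline by buffering them until a line that does not end in "+". This is not pk's or John's code: it additionally prints a definition's lines only when the joined definition contains SSLCAUTH(REQUIRED). The sample data and the names filter_required and emit are hypothetical, and "+" is assumed to be the very last character of a continued line.

```shell
#!/bin/sh
# Hypothetical sample: two definitions, no blank line between them.
sample=$(mktemp)
cat > "$sample" <<'EOF'
DEFINE CHANNEL ('AAA.SVRCONN') CHLTYPE(SVRCONN) +
DESCR('hypothetical') +
SSLCAUTH(REQUIRED) +
SSLCIPH(' ')
DEFINE CHANNEL ('BBB.SVRCONN') CHLTYPE(SVRCONN) +
SSLCAUTH(OPTIONAL) +
SSLCIPH(' ')
EOF

filter_required() {
    awk '
    {
        continued = sub(/\+$/, "")   # 1 if the line ended in "+" (now removed)
        buf[++n] = $0                # collect the physical lines of one definition
        if (continued) next          # more lines of this definition follow
        emit()
    }
    END { emit() }                   # covers a last line that ended in "+"

    # Print the interesting lines of the buffered definition, but only if
    # one of its lines contains SSLCAUTH(REQUIRED); then reset the buffer.
    function emit(   i, hit) {
        for (i = 1; i <= n; i++)
            if (buf[i] ~ /SSLCAUTH\(REQUIRED\)/) hit = 1
        if (hit)
            for (i = 1; i <= n; i++)
                if (buf[i] ~ /DEFINE CHANNEL|SSLCAUTH|SSLCIPH/) print buf[i]
        n = 0
    }' "$1"
}

filter_required "$sample"
```

Because every input line is read by awk's own main loop, there is no end-of-input case to handle by hand.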

From: pk on 26 Feb 2010 13:15

pk wrote:

> perl -ne
>[cut]
> The Perl array is unnamed

I guess that for true Perl correctness, all those objects I called "unnamed
arrays" are better called "lists". Sorry for the imprecision.
