From: Hongyi Zhao on
Hi all,

I've some text like the following:

80mm]{level.pdf}% Here is how to import EPS art
80mm]{p1.pdf}}
80mm]{p2.pdf}}
80mm]{method1.pdf}}
80mm]{method2.pdf}}
80mm]{method1-calculation}}
80mm]{method2-calculation}}

I want to extract the contents between the {} in the above snippet, so I
use the following code:

$awk -F'\{\}' '{ print $2 }' myfile

But all I get is the following warnings:

awk: warning: escape sequence `\{' treated as plain `{'
awk: warning: escape sequence `\}' treated as plain `}'

Why?

Regards.
--
..: Hongyi Zhao [ hongyi.zhao AT gmail.com ] Free as in Freedom :.
From: Kaz Kylheku on
On 2010-02-21, Hongyi Zhao <hongyi.zhao(a)gmail.com> wrote:
> Hi all,
>
> I've some text like the following:
>
> 80mm]{level.pdf}% Here is how to import EPS art
> 80mm]{p1.pdf}}
> 80mm]{p2.pdf}}
> 80mm]{method1.pdf}}
> 80mm]{method2.pdf}}
> 80mm]{method1-calculation}}
> 80mm]{method2-calculation}}
>
> I want to extract the contents between the {} in the above snippet, so I
> use the following code:
>
> $awk -F'\{\}' '{ print $2 }' myfile

You have a double escape here. The token '\{\}' has
single quotes, within which the backslash is ordinary;
so it denotes a four-character string.

Try:

echo '\{\}'

You're passing the text \{\} to awk, which lexically
analyzes it and looks for its own kinds of backslash
escape. Since \{ and \} are not escapes awk recognizes,
it warns and treats them as plain braces.

Braces don't in fact have to be escaped in this
situation. That is to say, -F{} is a valid shell token
representing a word that stands for the string "-F{}".

Braces are keywords in the shell only when not combined
with other characters. I.e. { and } are keywords,
but {} isn't, and of course neither is -F{}.

Moreover { and } have their special meaning only
in certain contexts (much like other shell keywords).
For instance "while" and "for" are keywords, but
"echo while for" doesn't fail or do anything weird;
when not used as commands, these words just
represent text that requires no escaping.
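
The quoting points above can be checked directly in the shell. This is only a small sketch for a POSIX-style shell; the variable names (quoted, plain, words) are made up for illustration.

```shell
# Inside single quotes the shell leaves backslashes alone, so the word
# that reaches awk is four characters long: \ { \ }
quoted=$(printf '%s' '\{\}')
printf '%s\n' "$quoted"      # prints \{\}
printf '%s\n' "${#quoted}"   # prints 4

# Braces combined with other characters are plain text, not keywords,
# so -F{} needs no escaping at all.
plain=$(printf '%s' -F{})
printf '%s\n' "$plain"       # prints -F{}

# Keywords such as "while" and "for" are only special in command position.
words=$(echo while for)
printf '%s\n' "$words"       # prints while for
```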


From: pk on
Hongyi Zhao wrote:

> Hi all,
>
> I've some text like the following:
>
> 80mm]{level.pdf}% Here is how to import EPS art
> 80mm]{p1.pdf}}
> 80mm]{p2.pdf}}
> 80mm]{method1.pdf}}
> 80mm]{method2.pdf}}
> 80mm]{method1-calculation}}
> 80mm]{method2-calculation}}
>
> I want to extract the contents between the {} in the above snippet, so I
> use the following code:
>
> $awk -F'\{\}' '{ print $2 }' myfile
>
> But all I get is the following warnings:
>
> awk: warning: escape sequence `\{' treated as plain `{'
> awk: warning: escape sequence `\}' treated as plain `}'

You've already got the answer to the "why" question. As for what you
probably want here instead:

awk -F '[{}]' ...

or

awk -F '{|}' ...
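
For reference, here is a sketch of the bracket-expression form run against a few of the sample lines from the original post. The "..." above is filled in with the poster's own '{ print $2 }' action; the variable names (sample, names) are just for illustration.

```shell
# Three of the sample lines from the original post.
sample='80mm]{level.pdf}% Here is how to import EPS art
80mm]{p1.pdf}}
80mm]{method1-calculation}}'

# Inside a bracket expression { and } are literal characters, so a field
# ends at either brace and $2 is the text between the first { and }.
names=$(printf '%s\n' "$sample" | awk -F '[{}]' '{ print $2 }')
printf '%s\n' "$names"
# prints:
# level.pdf
# p1.pdf
# method1-calculation
```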
From: Stephane CHAZELAS on
2010-02-21, 10:35(+00), pk:
[...]
> awk -F '{|}' ...

That's incorrect POSIX syntax (leads to unspecified results),
you want:

awk -F '\{|\}'

with a POSIX awk (like with gawk when POSIXLY_CORRECT is on).

awk -F '[{}]'

is definitely better here.


--
Stéphane
From: pk on
Stephane CHAZELAS wrote:

> 2010-02-21, 10:35(+00), pk:
> [...]
>> awk -F '{|}' ...
>
> That's incorrect POSIX syntax (leads to unspecified results),
> you want:
>
> awk -F '\{|\}'
>
> with a POSIX awk (like with gawk when POSIXLY_CORRECT is on).

That's incorrect as well, and takes you back to the '{|}' case; if you go
that route, you need

awk -F '\\{|\\}'

due to the way awk scans strings.
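
A rough way to see that scanning step, assuming an awk that applies string escape processing to -v assignments the same way it does to the -F argument (gawk and mawk do, as far as I know). The variable names are only illustrative.

```shell
# awk's string scanner consumes one level of backslashes, so the
# six-character pair \\{ and \\} arrives at the regex engine as \{ and \}.
scanned=$(awk -v s='\\{|\\}' 'BEGIN { print s }')
printf '%s\n' "$scanned"     # prints \{|\}

# The same doubled form used as the field separator on one sample line.
line='80mm]{p1.pdf}}'
field=$(printf '%s\n' "$line" | awk -F '\\{|\\}' '{ print $2 }')
printf '%s\n' "$field"
```

With only single backslashes the scanner would already have reduced \{ to {, which is why that form ends up equivalent to '{|}'.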

I used just '{|}' because most awks nowadays either do NOT support {} as
regex interval operators (even though POSIX mandates them), or do support
them but are smart enough to see that there's nothing to "quantify" there
and take the { and } literally.