From: Janis Papanagnou on
Stephane CHAZELAS wrote:
> 2010-01-24, 04:02(+01), Janis Papanagnou:
> [...]
>> Frankly, I've never used an old UNIX Edition 6 Bourne shell and don't know
>> how the case statement worked at that time, or whether the case statement
>> existed at all. [...]

...or whether the Bourne shell existed at all.

> Again, there was no Bourne shell in V6, the Bourne shell was
> first released in V7. V6's sh was the Thompson shell.

Thanks for clarifying. My UNIX time began later, somewhere between Edition 7
and Sys V. I have never used Edition 6 and don't know the details of that
ancient UNIX release, which is what I meant to say above.

Janis
From: Icarus Sparry on
On Sun, 24 Jan 2010 14:23:59 +0100, Janis Papanagnou wrote:

> Stephane CHAZELAS wrote:
>> 2010-01-24, 06:11(+01), Janis Papanagnou:
>>> Seebs wrote:
>>>> I'm not sure that even ksh can do everything posix REs can.
>>> I am confident and quite sure it does. Vice versa; I think the regexp
>>> library will at least have problems emulating ksh's !(...)
>>> construct. Ever tried? In general you'll get extremely bulky results
>>> here! But the class of languages (regular expressions) is the same,
>>> anyway.[*]
>>>
>>> [*] N.B. Newer ksh's also support back-references in their
>>> expressions, so strictly speeking, with that feature, they exceed the
>>> Chomsky-3 grammar class as well (analogous to other libraries with
>>> backreference extensions).
>> [...]
>>
>> Recent versions of AT&T ksh can also convert globbing patterns to
>> regular expressions (an AT&T variant thereof):
>>
>> $ ksh -c 'printf "%R\n" "!(...)"'
>> ^(\.\.\.)!$
>
> I didn't know that ! was a regular expression meta operator in regexp.
>
> The ! _regexp_ meta operator does not seem to produce results on my box.
>
> $ ls
> hello  hello world  helloworld  regexp  world
> $ ls !(hello)
> hello world  helloworld  regexp  world
> $ ksh -c 'printf "%R\n" "!(hello)"'
> ^(hello)!$
> $ ls | egrep '^(hello)!$'
> ### nothing matched ###
>
> It works well with the known operators + and * etc. resp. +(...) *(...)
> etc.
>
> Still interested in a regexp expression conforming to !(...) ext. glob.

The grep from AST will handle this '^(hello)!$' if you give it a -X flag
to enable the augmented expressions.

An instant thought on matching !(hello) with a RE, this is

^([^h].*|h[^e].*|he[^l].*|hel[^l].*|hell[^o].*|hello..*)$
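
A quick way to sanity-check an alternation like this is to enumerate short
strings and compare the regexp's verdict with what !(hello) should accept
(everything except "hello"). The sketch below is only an illustration in
Python; the helper names and the small test alphabet "helo" are assumptions
made for the example, not part of ksh or grep.

```python
import itertools
import re

# The alternation quoted above, copied verbatim.
CANDIDATE = re.compile(r"^([^h].*|h[^e].*|he[^l].*|hel[^l].*|hell[^o].*|hello..*)$")


def all_strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            yield "".join(chars)


def disagreements(pattern, word, alphabet="helo", max_len=6):
    """Return the strings where `pattern` and the glob !(word) disagree."""
    return [
        s
        for s in all_strings(alphabet, max_len)
        if bool(pattern.fullmatch(s)) != (s != word)
    ]


print(disagreements(CANDIDATE, "hello"))
# → ['', 'h', 'he', 'hel', 'hell']
```

Within this test range the only disagreements are the empty string and the
proper prefixes of "hello": each branch requires at least one character past
the shared prefix, so "h", "he", "hel" and "hell" are never matched.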

There has been discussion on comp.compilers recently on calculating the
difference between two regular expressions.
From: Janis Papanagnou on
Icarus Sparry wrote:
> On Sun, 24 Jan 2010 14:23:59 +0100, Janis Papanagnou wrote:
>
>> Stephane CHAZELAS wrote:
>>> 2010-01-24, 06:11(+01), Janis Papanagnou:
>>>> Seebs wrote:
>>>>> I'm not sure that even ksh can do everything posix REs can.
>>>> I am confident and quite sure it does. Vice versa; I think the regexp
>>>> library will at least have problems emulating ksh's !(...)
>>>> construct. Ever tried? In general you'll get extremely bulky results
>>>> here! But the class of languages (regular expressions) is the same,
>>>> anyway.[*]
>>>>
>>>> [*] N.B. Newer ksh's also support back-references in their
>>>> expressions, so strictly speeking, with that feature, they exceed the
>>>> Chomsky-3 grammar class as well (analogous to other libraries with
>>>> backreference extensions).
>>> [...]
>>>
>>> Recent versions of AT&T ksh can also convert globbing patterns to
>>> regular expressions (an AT&T variant thereof):
>>>
>>> $ ksh -c 'printf "%R\n" "!(...)"'
>>> ^(\.\.\.)!$
>> I didn't know that ! was a regular expression meta operator in regexp.
>>
>> The ! _regexp_ meta operator does not seem to produce results on my box.
>>
>> $ ls
>> hello  hello world  helloworld  regexp  world
>> $ ls !(hello)
>> hello world  helloworld  regexp  world
>> $ ksh -c 'printf "%R\n" "!(hello)"'
>> ^(hello)!$
>> $ ls | egrep '^(hello)!$'
>> ### nothing matched ###
>>
>> It works well with the known operators + and * etc. resp. +(...) *(...)
>> etc.
>>
>> Still interested in a regexp expression conforming to !(...) ext. glob.
>
> The grep from AST will handle this '^(hello)!$' if you give it a -X flag
> to enable the augmented expressions.

Interesting to know. Thanks! This is unique to the AST grep, I suppose.

> An instant thought on matching !(hello) with a RE, this is
>
> ^([^h].*|h[^e].*|he[^l].*|hel[^l].*|hell[^o].*|hello..*)$

Yes, something like that was what I had in mind when I spoke of "extremely
bulky results".

> There has been discussion on comp.compilers recently on calculating the
> difference between two regular expressions.

(I must have a look to understand what's meant by differences between REs.)

Janis
From: pk on
Janis Papanagnou wrote:

> Still interested in a regexp expression conforming to !(...) ext. glob.

Well if lookaround assertions are available, it's trivially done with
something like

(?!whatyoudontwant).*\.ext

or equivalent.

(Yes, we're way out of the "regular" regexp domain here, probably even more
than with backreferences)

However, I don't think any shell implements that.
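
As an illustration outside the shell, here is a minimal Python sketch of the
lookahead approach applied to the !(hello) case from earlier in the thread.
Python's re module supports (?!...); the pattern name and the list of names
(read off the ls listing quoted in this thread) are assumptions made for the
example.

```python
import re

# Negative lookahead at the start of the string: fail only when the
# entire string is exactly "hello"; otherwise .* consumes everything.
NOT_HELLO = re.compile(r"(?!hello$).*")

names = ["hello", "hello world", "helloworld", "regexp", "world"]
print([n for n in names if NOT_HELLO.fullmatch(n)])
# → ['hello world', 'helloworld', 'regexp', 'world']
```

The "$" inside the lookahead matters: without it, every name that merely
starts with "hello" would be rejected as well.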
From: Kaz Kylheku on
On 2010-01-24, Icarus Sparry <usenet(a)icarus.freeuk.com> wrote:
> On Sun, 24 Jan 2010 14:23:59 +0100, Janis Papanagnou wrote:
>
>> Stephane CHAZELAS wrote:
>>> 2010-01-24, 06:11(+01), Janis Papanagnou:
>>>> Seebs wrote:
>>>>> I'm not sure that even ksh can do everything posix REs can.
>>>> I am confident and quite sure it does. Vice versa; I think the regexp
>>>> library will at least have problems emulating ksh's !(...)
>>>> construct. Ever tried? In general you'll get extremely bulky results
>>>> here! But the class of languages (regular expressions) is the same,
>>>> anyway.[*]
>>>>
>>>> [*] N.B. Newer ksh's also support back-references in their
>>>> expressions, so strictly speeking, with that feature, they exceed the
>>>> Chomsky-3 grammar class as well (analogous to other libraries with
>>>> backreference extensions).
>>> [...]
>>>
>>> Recent versions of AT&T ksh can also convert globbing patterns to
>>> regular expressions (an AT&T variant thereof):
>>>
>>> $ ksh -c 'printf "%R\n" "!(...)"'
>>> ^(\.\.\.)!$
>>
>> I didn't know that ! was a regular expression meta operator in regexp.
>>
>> The ! _regexp_ meta operator does not seem to produce results on my box.
>>
>> $ ls
>> hello  hello world  helloworld  regexp  world
>> $ ls !(hello)
>> hello world  helloworld  regexp  world
>> $ ksh -c 'printf "%R\n" "!(hello)"'
>> ^(hello)!$
>> $ ls | egrep '^(hello)!$'
>> ### nothing matched ###
>>
>> It works well with the known operators + and * etc. resp. +(...) *(...)
>> etc.
>>
>> Still interested in a regexp expression conforming to !(...) ext. glob.
>
> The grep from AST will handle this '^(hello)!$' if you give it a -X flag
> to enable the augmented expressions.
>
> An instant thought on matching !(hello) with a RE, this is
>
> ^([^h].*|h[^e].*|he[^l].*|hel[^l].*|hell[^o].*|hello..*)$

Good start.

A regular expression denotes a set. The expression "hello" denotes
the set { "hello" }. The complement of the regular expression is the
complement of this set: it is the (infinite) set of all strings
except "hello".

This set includes:

- the empty string;
- all strings of length 1 through length 4;
- all strings of length 5 except "hello"; and
- all strings of length 6 and greater

By articulating the complement result as the above set, we can now write
the expression as a union of the constituent categories:

(|.|..|...|....|[^h]....|h[^e]...|he[^l]..|hel[^l].|hell[^o]|.......*)
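
The same length-based decomposition works for any literal word, so it can be
generated mechanically. The following Python sketch is one way to do that and
to brute-force check the result; the function names and the small test
alphabet "helo" are assumptions for the example, and "." is taken to match
any character in the test alphabet (Python's "." excludes newline by
default).

```python
import itertools
import re


def complement_of_literal(word):
    """Build a regexp matching every string except `word`."""
    n = len(word)
    # Every string strictly shorter than the word (lengths 0 .. n-1).
    shorter = ["." * k for k in range(n)]
    # Same length as the word, differing from it at position i.
    same_length = [
        re.escape(word[:i]) + "[^" + re.escape(word[i]) + "]" + "." * (n - i - 1)
        for i in range(n)
    ]
    # Every string strictly longer than the word (n+1 characters or more).
    longer = ["." * (n + 1) + ".*"]
    return "(" + "|".join(shorter + same_length + longer) + ")"


def mismatches(regexp, word, alphabet="helo", max_len=7):
    """Strings up to max_len where the regexp and 'anything but word' differ."""
    pattern = re.compile(regexp)
    found = []
    for k in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=k):
            s = "".join(chars)
            if bool(pattern.fullmatch(s)) != (s != word):
                found.append(s)
    return found


regexp = complement_of_literal("hello")
print(regexp)
# → (|.|..|...|....|[^h]....|h[^e]...|he[^l]..|hel[^l].|hell[^o]|.......*)
print(mismatches(regexp, "hello"))
# → []
```

The generated expression has 2n+1 branches for a word of length n, which
gives a feel for how quickly the "bulky results" grow once the argument of
!(...) is more than a single literal.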

I recently implemented an alternative regexp engine in Txr which supports
intersection and complement. I wrote up a bit of a regex discussion
section in the man page: http://www.nongnu.org/txr/txr-manpage.html