From: genxtech on
I am trying to learn regular expressions in python3 and have an issue
with one of the examples I'm working with.
The code is:

#! /usr/bin/env python3

import re

search_string = "[^aeiou]y$"
print()

in_string = 'vacancy'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

in_string = 'boy'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

in_string = 'day'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

in_string = 'pita'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

The output that I am getting is:
ay, ey, iy, oy and uy are not at the end of vacancy.
ay, ey, iy, oy or uy were found at the end of boy.
ay, ey, iy, oy or uy were found at the end of day.
ay, ey, iy, oy or uy were found at the end of pita.

The last line of the output is the opposite of what I expected to see,
and I'm having trouble figuring out what the issue is. Any help would
be greatly appreciated.
From: Thomas Jollans on
On Monday 09 August 2010, it occurred to genxtech to exclaim:
> I am trying to learn regular expressions in python3 and have an issue
> with one of the examples I'm working with.
> The code is:
>
> #! /usr/bin/env python3
>
> import re
>
> search_string = "[^aeiou]y$"

To translate this expression to English:

a character that is not a, e, i, o, or u, followed by the character 'y', at
the end of the string (or just before a trailing newline).

"vacancy" matches: it ends with "c" (not one of aeiou), followed by "y".

"pita" does not match: it does not end with "y".

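A minimal sketch of the reading above, run against the four words from the original post. The helper name matched_text is only an illustration and is not part of the original program.

```python
import re

# Same pattern as in the original post: a non-vowel, then 'y', then the end.
pattern = re.compile(r"[^aeiou]y$")

def matched_text(word):
    """Return the matched substring, or None when the pattern does not match."""
    m = pattern.search(word)
    return m.group(0) if m is not None else None

for word in ("vacancy", "boy", "day", "pita"):
    print(word, "->", matched_text(word))
```

Only "vacancy" produces a match (the substring "cy"); the other three give None, each for its own reason.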

From: MRAB on
genxtech wrote:
> I am trying to learn regular expressions in python3 and have an issue
> with one of the examples I'm working with.
> The code is:
>
> #! /usr/bin/env python3
>
> import re
>
> search_string = "[^aeiou]y$"

You can think of this as: a non-vowel followed by a 'y', then the end of
the string.

> print()
>
> in_string = 'vacancy'
> if re.search(search_string, in_string) != None:
>     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
> else:
>     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Matches because 'c' is a non-vowel, 'y' matches, and then the end of the
string.

> print()
>
> in_string = 'boy'
> if re.search(search_string, in_string) != None:
>     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
> else:
>     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Doesn't match because 'o' is a vowel, not a non-vowel.

> print()
>
> in_string = 'day'
> if re.search(search_string, in_string) != None:
>     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
> else:
>     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Doesn't match because 'a' is a vowel, not a non-vowel.

> print()
>
> in_string = 'pita'
> if re.search(search_string, in_string) != None:
>     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
> else:
>     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Doesn't match because 't' is a non-vowel but 'a' doesn't match 'y'.
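
The three-part reading used in this walk-through can be written into the pattern itself with re.VERBOSE, which lets each piece carry a comment. This is only a sketch; the names annotated and results are made up for the illustration.

```python
import re

# The same pattern as "[^aeiou]y$", spelled out one piece per line.
annotated = re.compile(r"""
    [^aeiou]   # any single character that is not a lowercase vowel
    y          # a literal 'y'
    $          # the end of the string (or just before a final newline)
""", re.VERBOSE)

results = {word: annotated.search(word) is not None
           for word in ("vacancy", "boy", "day", "pita")}
print(results)
```

Note that [^aeiou] means "anything except these five letters", so digits, punctuation and uppercase vowels also count as non-vowels here.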


From: Chris Rebert on
On Sun, Aug 8, 2010 at 3:32 PM, Thomas Jollans <thomas(a)jollybox.de> wrote:
> On Monday 09 August 2010, it occurred to genxtech to exclaim:
>> I am trying to learn regular expressions in python3 and have an issue
>> with one of the examples I'm working with.
>> The code is:
>>
>> #! /usr/bin/env python3
>>
>> import re
>>
>> search_string = "[^aeiou]y$"
>
> To translate this expression to English:
>
> a character that is not a, e, i, o, or u, followed by the character 'y', at
> the end of the string (or just before a trailing newline).
>
> "vacancy" matches: it ends with "c" (not one of aeiou), followed by "y".
>
> "pita" does not match: it does not end with "y".

Or in other words, the regex will not match when:
- the string ends in "ay", "ey", "iy", "oy", or "uy"
- the string doesn't end in "y"
- the string is fewer than 2 characters long

So, the program has a logic error in its assumptions. A non-match
*doesn't* imply that a string ends in one of the aforementioned pairs;
the other possibilities have been overlooked.
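
To make the three non-match cases concrete, here is a rough sketch that reports which one applies. The function explain is hypothetical and exists only to illustrate the point; it assumes plain words without a trailing newline.

```python
import re

def explain(word):
    """Hypothetical helper: say why r"[^aeiou]y$" does or does not match."""
    if re.search(r"[^aeiou]y$", word) is not None:
        return "match"
    if len(word) < 2:
        return "no match: fewer than 2 characters"
    if not word.endswith("y"):
        return "no match: does not end in 'y'"
    # At least two characters, ends in 'y', and still no match:
    # the character before the 'y' must be one of a, e, i, o, u.
    return "no match: ends in a vowel followed by 'y'"

for word in ("vacancy", "boy", "pita", "y"):
    print(word, "->", explain(word))
```

The original program treated every non-match as the last case, which is why "pita" was reported incorrectly.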

May I suggest instead using the much more straightforward
`search_string = "[aeiou]y$"` and then swapping your conditions
around? The double-negative sort of style the program is currently
using is (as you've just experienced) harder to reason about and thus
more error-prone.
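
One way the suggestion might look in practice, as a sketch rather than a definitive rewrite: the pattern is positive, so a match means a vowel-plus-y ending and a non-match means everything else. The helper describe and the name vowel_y are assumptions made for the example.

```python
import re

# Positive pattern: a lowercase vowel, then 'y', at the end of the string.
vowel_y = re.compile(r"[aeiou]y$")

def describe(word):
    """Hypothetical helper that builds the message for one word."""
    if vowel_y.search(word) is not None:
        return "ay, ey, iy, oy or uy were found at the end of {0}.".format(word)
    return "ay, ey, iy, oy and uy are not at the end of {0}.".format(word)

for word in ("vacancy", "boy", "day", "pita"):
    print(describe(word))
```

With this version "pita" and "vacancy" both fall into the "are not at the end" branch, while "boy" and "day" are reported as found.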

Cheers,
Chris
--
http://blog.rebertia.com
From: Tim Chase on
On 08/08/10 17:20, genxtech wrote:
> if re.search(search_string, in_string) != None:

While the other responses have addressed some of the big issues,
it's also good to use

if thing_to_test is None:

or

if thing_to_test is not None:

instead of "== None" or "!= None".
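
A small sketch of why the identity test is the safer habit. The class AlwaysEqual is invented for the illustration: an object can define its own equality, so "== None" and "!= None" may give misleading answers, whereas "is None" only checks whether the object is the None singleton.

```python
import re

class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""
    def __eq__(self, other):
        return True

obj = AlwaysEqual()
equality_says_none = (obj == None)   # True: the custom __eq__ was consulted
identity_says_none = (obj is None)   # False: obj is not the None singleton

# Applied to the test from the original post:
m = re.search(r"[^aeiou]y$", "vacancy")
found = m is not None
print(equality_says_none, identity_says_none, found)
```

re.search returns either a match object or None, so both spellings happen to agree there; the identity test simply does not depend on how the object defines equality.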

-tkc