From: John Posner on
On Fri, 08 Jan 2010 14:28:57 -0500, MRAB <python(a)mrabarnett.plus.com>
wrote:

> Dave McCormick wrote:
>> On Wed, Jan 6, 2010 at 9:18 AM, John Posner <jjposner(a)optimum.net> wrote:
>> On Tue, 05 Jan 2010 16:54:44 -0500, Dave McCormick
>> <mackrackit(a)gmail.com> wrote:
>> But it is not what I am wanting. I first thought to make it look
>> for a space but that would not work when a single character like
>> "#" is to be colored if there is a "string" of them. Or if all
>> of the characters between quotes are to be colored.
>> Regular expressions are good at handling searches like:
>> * all the characters between quotes
>> * the two-character string "do", but only if it's a complete word
>> -John
>> I need another hint...
>> Been doing some reading and playing and it looks like
>> r'\bxxx\b'
>> is what I need. But I can not figure out how to pass a variable between
>> \b___\b
>> If the word in question is between the "\b \b" and in the list then it
>> works like I want it to.
>> The below does not work.
>> greenList_regexp = "|".join(greenList)
>> for matchobj in re.finditer(r'\bgreenList_regexp\b', complete_text):
>> start,end = matchobj.span()
>>
> The regex r'\bgreenList_regexp\b' will match the string
> 'greenList_regexp' if it's a whole word.
>
> What you mean is "any of these words, provided that they're whole
> words". You'll need to group the alternatives within "(?:...)", like
> this:
>
> r'\b(?:' + greenList_regexp + ')\b'

Oops, MRAB, you forgot to make the last literal a RAW string -- it should
be r')\b'
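
To see why that second "r" matters, here is a minimal sketch. The sample text and the names pattern and words are made up for illustration; greenList and greenList_regexp are the ones from Dave's code. In a normal string literal, '\b' is a backspace character, so the regex engine never sees a word-boundary marker:

```python
import re

greenList = ['green', 'grass', 'grump']
greenList_regexp = "|".join(greenList)

# In a normal string literal, '\b' is the backspace character (0x08).
assert ')\b' == ')\x08'
# In a raw string literal the backslash survives, so the regex engine
# receives the two characters \ and b, meaning "word boundary".
assert r')\b' == ')\\b'

# Both literals raw: "any of these words, but only as whole words".
pattern = r'\b(?:' + greenList_regexp + r')\b'

# Made-up sample text, not from Dave's app.
complete_text = "the green grass is greener than the grump's grassland"
words = [complete_text[start:end]
         for start, end in (m.span() for m in re.finditer(pattern, complete_text))]
print(words)
```

With this sample, "greener" and "grassland" are skipped because the listed word is only part of a longer word.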

Dave, we're already into some pretty heavy regular-expression work, huh?
Here's another approach -- not nearly as elegant as MRAB's:

Given this list:

greenList = ['green', 'grass', 'grump']

... you currently are using join() to construct this regexp search string:

'green|grass|grump'

... but you've decided that you really want this similar regexp search
string:

r'\bgreen\b|\bgrass\b|\bgrump\b'

You can achieve this by transforming each item on the list, then invoking
join() on the transformed list to create the search string. Here are a
couple of ways to transform the list:

* List comprehension:

whole_word_greenList = [ r'\b' + word + r'\b' for word in greenList]

* map() and a user-defined function:

def xform_to_wholeword_searchstring(word):
    return r'\b' + word + r'\b'

whole_word_greenList = map(xform_to_wholeword_searchstring, greenList)
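
Either way, the last step is to join() the transformed list and hand the result to re.finditer(). A rough sketch, using the list-comprehension version; the names search_string and found and the sample text are mine, just for illustration:

```python
import re

greenList = ['green', 'grass', 'grump']

# Wrap every word in word-boundary markers, then OR them together.
whole_word_greenList = [r'\b' + word + r'\b' for word in greenList]
search_string = "|".join(whole_word_greenList)

# Made-up sample text, not from Dave's app.
complete_text = "green grassland, greener grass"

# Collect (start, end, matched text) for each whole-word hit.
found = [(m.start(), m.end(), m.group())
         for m in re.finditer(search_string, complete_text)]
print(found)
```

If the words on the list could ever contain characters that are special in a regexp (".", "*", "(" and so on), passing each word through re.escape() before adding the \b markers would probably be the safer choice.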


HTH,
John
From: Dave McCormick on


John Posner wrote:
> On Fri, 08 Jan 2010 14:28:57 -0500, MRAB <python(a)mrabarnett.plus.com>
> wrote:
>
>> The regex r'\bgreenList_regexp\b' will match the string
>> 'greenList_regexp' if it's a whole word.
>>
>> What you mean is "any of these words, provided that they're whole
>> words". You'll need to group the alternatives within "(?:...)", like
>> this:
>>
>> r'\b(?:' + greenList_regexp + ')\b'
>
> Oops, MRAB, you forgot to make the last literal a RAW string -- it
> should be r')\b'
>
> Dave, we're already into some pretty heavy regular-expression work,
> huh? Here's another approach -- not nearly as elegant as MRAB's:
>
> Given this list:
>
> greenList = ['green', 'grass', 'grump']
>
> ... you currently are using join() to construct this regexp search
> string:
>
> 'green|grass|grump'
>
> ... but you've decided that you really want this similar regexp search
> string:
>
> r'\bgreen\b|\bgrass\b|\bgrump\b'
>
> You can achieve this by transforming each item on the list, then
> invoking join() on the transformed list to create the search string.
> Here are a couple of ways to transform the list:
>
> * List comprehension:
>
> whole_word_greenList = [ r'\b' + word + r'\b' for word in greenList]
>
> * map() and a user-defined function:
>
> def xform_to_wholeword_searchstring(word):
>     return r'\b' + word + r'\b'
>
> whole_word_greenList = map(xform_to_wholeword_searchstring, greenList)
>
>
> HTH,
> John
John,
That second "r" appears to do the trick.

Yeah, pretty heavy into it. I read someplace that regular expressions
were tricky, but I did not expect this :)

Now to start working this into the rest of my app and study your second
approach.

Thanks again for the help!!!
Dave