Looking for recommended approach to using Regular Expressions. [MFC]

Prev: Problem handling keyboard events in ActiveX on IE8
Next: project dependencies: satellite dlls and main-project

From: Joseph M. Newcomer on 23 Jun 2010 15:33

See below...
On Wed, 23 Jun 2010 15:25:22 -0400, "Pete Delgado" <Peter.Delgado(a)NoSpam.com> wrote:

>
>"Joseph M. Newcomer" <newcomer(a)flounder.com> wrote in message
>news:cqj426tf8a7iiggmrdv3boe0u6043uuiio(a)4ax.com...
>> But it is so trivial to write an FSM parser; for the cases cited, I can
>> write an FSM
>> parser as fast as I can type. A regexp is technological overkill,
>> particularly because
>> the flexibility of a programmable pattern is not required!
>> joe
>
>Joe,
>If you take into account internationalization and the differences in format
>pattern required for the fields cited, the FSM approach becomes more complex
>than using one of the regular expression libraries. Of course, I'm assuming
>that the author is familiar with writing validating regular expressions
>because it is just as easy to write a buggy expression as it is to write
>buggy code!
****
And that means someone has to translate the date-time format of the default user locale to
a regexp...
joe
****
>
>However, if this particular piece of code will *never* be used anywhere else
>except where the author intends, then I see the wisdom in your approach and
>completely agree with your conclusion that a regexp is overkill and cannot
>handle incomplete data cases while your method can.
>
>-Pete
>
Joseph M. Newcomer [MVP]
email: newcomer(a)flounder.com
Web: http://www.flounder.com
MVP Tips: http://www.flounder.com/mvp_tips.htm
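
For readers following the FSM-versus-regexp argument, here is a minimal sketch of the kind of hand-written state machine being discussed. The fields the thread refers to are not shown here, so the 24-hour HH:MM field, the `checkTime` function, and the `TimeMatch` enum are all assumptions made up for illustration; this is not Joe's code. It shows how an FSM can report a third outcome, "incomplete", for text that is a valid prefix of the format.

```cpp
#include <cctype>
#include <string>

// Verdict on the text typed so far (hypothetical names, for illustration).
enum TimeMatch { TimeInvalid, TimeIncomplete, TimeComplete };

// Assumed example field: a 24-hour time written as HH:MM.
TimeMatch checkTime(const std::string& text)
{
    enum State { Hour1, Hour2, Colon, Minute1, Minute2, Done };
    State state = Hour1;
    char hourTens = '0';

    for (std::string::size_type i = 0; i < text.size(); ++i) {
        const char ch = text[i];
        const bool digit = std::isdigit(static_cast<unsigned char>(ch)) != 0;
        switch (state) {
        case Hour1:    // first hour digit must be 0-2
            if (!digit || ch > '2') return TimeInvalid;
            hourTens = ch;
            state = Hour2;
            break;
        case Hour2:    // second hour digit: 0-9, or 0-3 after a leading 2
            if (!digit || (hourTens == '2' && ch > '3')) return TimeInvalid;
            state = Colon;
            break;
        case Colon:
            if (ch != ':') return TimeInvalid;
            state = Minute1;
            break;
        case Minute1:  // first minute digit must be 0-5
            if (!digit || ch > '5') return TimeInvalid;
            state = Minute2;
            break;
        case Minute2:
            if (!digit) return TimeInvalid;
            state = Done;
            break;
        case Done:     // anything after HH:MM is an error
            return TimeInvalid;
        }
    }
    // Running out of input before reaching Done means "valid so far".
    return state == Done ? TimeComplete : TimeIncomplete;
}
```

An edit control's change handler could call `checkTime` on every keystroke and flag the field only when the result is `TimeInvalid`, leaving a half-typed value such as "12:3" alone.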

From: Jeff Flinn on 24 Jun 2010 08:00

Joseph M. Newcomer wrote:
> See below...
> On Tue, 22 Jun 2010 10:33:28 -0700, Cameron_C <CameronC(a)discussions.microsoft.com> wrote:
>
>> Are there any performance differences among the rainbow of choices?
> ****
> Why do you think it matters? How many millions of matches are you going to need to make
> for each regexp? (Note: if you cannot express it in terms of integer multiples of
> millions, the performance probably won't matter)
> ****
>> Or all about the same?
>> I am guessing that the Feature Pack implementation would be the best choice
>> of direction overall, since it is incorporated into the MFC framework?
> ****
> Avoid anything nonstandard. So the TR1 design (which probably involved intelligent,
> thinking human beings) should be a reasonable choice.

The C++ TR1 design was driven by John Maddock's Boost.Regex library; in
fact, Boost also ships a TR1 library that includes it. Yes, he's an
intelligent, thinking human. Each compiler vendor provides its own
implementation, so I'm not sure who developed MS's implementation.
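
As a rough illustration of the TR1-style interface, here is a minimal sketch that validates a whole field with `regex_match`. The HH:MM field and the `isTime` function are assumptions made up for this example, not something from the thread. The sketch uses the `std::regex` names that C++11 later standardized; with a TR1 implementation the same classes live in `std::tr1`.

```cpp
#include <regex>
#include <string>

// Assumed example field: a 24-hour time written as HH:MM.
bool isTime(const std::string& text)
{
    // Default (ECMAScript) grammar; the pattern is compiled once and reused.
    static const std::regex pattern("([01][0-9]|2[0-3]):([0-5][0-9])");

    // regex_match succeeds only if the entire string matches the pattern.
    return std::regex_match(text, pattern);
}
```

Note that this gives a plain yes/no answer: a partial entry such as "12:3" is rejected in the same way as "24:00".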

Boost also has the Xpressive lib, developed by Eric Niebler, who wrote
Greta, a regex engine, while at Microsoft. Xpressive has both dynamic
and static regex engines. The latter is implemented as a Domain Specific
Embedded Language. This allows your regular expression to be built and
syntax-checked at compile time.

There was a recent thread on the Boost developer mailing list concerning
regex performance. They mentioned iregexp (sp?) from Google as a top
performer, along with Xpressive and Boost.Regex.

The OP might also check whether the Boost Spirit parser library
addresses his needs.

Jeff
