From: Jason Carlton on
On Mar 25, 5:45 pm, "J. Gleixner" <glex_no-s...(a)qwest-spam-no.invalid>
wrote:
> Jason Carlton wrote:
> > On Mar 9, 11:49 pm, Jason Carlton <jwcarl...(a)gmail.com> wrote:
> >> On Mar 9, 9:21 pm, s...(a)netherlands.com wrote:
>
> >>> On Mon, 8 Mar 2010 19:03:03 -0800 (PST), Jason Carlton <jwcarl...(a)gmail.com> wrote:
> >>>> Every once in a while, someone will copy and paste into my message
> >>>> board from Word. After it submits through my Perl script, I'll have
> >>>> something like this plugged in:
> >>>> Normal 0 false false false EN-US X-NONE X-NONE
> >>>> MicrosoftInternetExplorer4 /* Style Definitions */
> >>>> table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-
> >>>> rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-
> >>>> style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-
> >>>> padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-
> >>>> margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:
> >>>> 0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt;
> >>>> font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-
> >>>> ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New
> >>>> Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-
> >>>> family:Calibri; mso-hansi-theme-font:minor-latin;}
> >>>> The fonts and all that are different for each post; the only
> >>>> consistency seems to be that it starts with "Normal 0 false false
> >>>> false", and it ends with a "}".
> >>>> Would something as simple as this be enough to consistently remove it?
> >>>> $comment =~ s/Normal 0 false false false.*?}//gsi;
> >>>> Or is there more to it than I'm thinking?
> >>> $comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;
> >> Thanks, s.
>
> > Unfortunately, neither of these is working the way I expected:
>
> > $comment =~ s/Normal 0 false false false.*?}//gsi;
> > $comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;
>
> > It's catching the "Normal 0 false false false", but not everything
> > else that comes after, and before the "}".
>
> > How do I make it remove everything from "Normal 0 false false false"
> > until it finds the first "}"?
>
> $comment =~ s/Normal 0 false false false[^}]*}//gsi;
>
> my $str = 'Start Normal 0 false false false blah blah { more blah }
> Starting second match Normal 0 false false false blah blah { more blah }
> The End';
> $str =~ s/Normal 0 false false false[^}]*}//gsi;
> print $str;
>
> Start  Starting second match  The End

J, should that first "}" be a "{"? Like:

$str =~ s/Normal 0 false false false[^{]*}//gsi;
From: J. Gleixner on
Jason Carlton wrote:
[...]
>>>>>> The fonts and all that are different for each post; the only
>>>>>> consistency seems to be that it starts with "Normal 0 false false
>>>>>> false", and it ends with a "}".
>>>>>> Would something as simple as this be enough to consistently
>>>>>> remove it?
[...]
> J, should that first "}" be a "{"? Like:
> $str =~ s/Normal 0 false false false[^{]*}//gsi;

Before asking if it's not correct, why not try it?

[^}]* - match any run of characters that are not '}', i.e.
everything up to (but not including) the first '}'
} - match the '}' itself. Without that you'll
have '}' left in your results.
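
To see the difference between the two character classes, here is a small sketch that runs both substitutions on text modelled on the example above. The variable names are made up for illustration, and the closing brace is written as \} purely for readability.

```perl
use strict;
use warnings;

# Sample text modelled on the example earlier in the thread.
my $text = 'Start Normal 0 false false false blah blah { more blah } The End';

# [^}]* consumes every character that is not '}', including the '{'.
# The literal \} that follows then consumes the closing brace.
(my $with_close = $text) =~ s/Normal 0 false false false[^}]*\}//gsi;

# [^{]* stops in front of the first '{'. The next character would have to
# be '}', but it is '{', and backtracking finds no '}' any earlier, so the
# pattern does not match and the text is left alone.
(my $with_open = $text) =~ s/Normal 0 false false false[^{]*\}//gsi;

print "$with_close\n";   # Start  The End
print "$with_open\n";    # same as $text
```

So with [^{] in the class, nothing is removed from text of this shape, because a '{' always appears before the '}'.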

I gave example text and the output it generates. If that
doesn't match what you want, then please be a little
more verbose: provide a -short- example of the text before,
and what you want the text to be after the substitution.
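
A before/after example of that kind could be as short as the following sketch. The sub name strip_word_residue and the sample comment are hypothetical; only the pattern comes from this thread.

```perl
use strict;
use warnings;

# Hypothetical helper; wraps the substitution suggested in this thread.
sub strip_word_residue {
    my ($comment) = @_;
    $comment =~ s/Normal 0 false false false[^}]*\}//gsi;
    return $comment;
}

# Text before: a shortened, made-up version of the pasted Word residue.
my $before = 'Hello Normal 0 false false false EN-US X-NONE '
           . '/* Style Definitions */ table.MsoNormalTable '
           . '{mso-style-noshow:yes;} world';

# Text wanted after: everything from "Normal 0 ..." through the first '}' gone.
my $wanted = 'Hello  world';

my $after = strip_word_residue($before);
print $after eq $wanted ? "ok\n" : "not ok: [$after]\n";
```

If the real input prints "not ok", the bracketed output shows exactly what was left behind, which is the detail needed to adjust the pattern.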