From: Tad McClellan on
Jason Carlton <jwcarlton(a)gmail.com> wrote:
> On Mar 9, 8:30 pm, Tad McClellan <ta...(a)seesig.invalid> wrote:
>> Jason Carlton <jwcarl...(a)gmail.com> wrote:
>> > Sorry if I made that too much to read.
>>
>> You've shown in the past that anything you write is too much to read.
>>
>> :-(
>>
>> --
>> Tad McClellan
>> email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
>> The above message is a Usenet post.
>> I don't recall having given anyone permission to use it on a Web site.


It is bad netiquette to quote .sigs.


> So, you're saying that you don't know the answer?


No, I'm saying that I am withholding the answer.


--
Tad McClellan
email: perl -le "print scalar reverse qq/moc.liamg\100cm.j.dat/"
The above message is a Usenet post.
I don't recall having given anyone permission to use it on a Web site.
From: Jason Carlton on
On Mar 9, 9:21 pm, s...(a)netherlands.com wrote:
> On Mon, 8 Mar 2010 19:03:03 -0800 (PST), Jason Carlton <jwcarl...(a)gmail.com> wrote:
> >Every once in awhile, someone will copy and paste into my message
> >board from Word. After it submits through my Perl script, I'll have
> >something like this plugged in:
>
> >Normal 0 false false false EN-US X-NONE X-NONE
> >MicrosoftInternetExplorer4 /* Style Definitions */
> >table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-
> >rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-
> >style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-
> >padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-
> >margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:
> >0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt;
> >font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-
> >ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New
> >Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-
> >family:Calibri; mso-hansi-theme-font:minor-latin;}
>
> >The fonts and all that are different for each post; the only
> >consistency seems to be that it starts with "Normal 0 false false
> >false", and it ends with a "}".
>
> >Would something as simple as this be enough to consistently remove it?
>
> >$comment =~ s/Normal 0 false false false.*?}//gsi;
>
> >Or is there more to it than I'm thinking?
>
> $comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;

Thanks, s.
From: Jason Carlton on
On Mar 9, 11:49 pm, Jason Carlton <jwcarl...(a)gmail.com> wrote:
> On Mar 9, 9:21 pm, s...(a)netherlands.com wrote:
>
>
>
>
>
> > On Mon, 8 Mar 2010 19:03:03 -0800 (PST),JasonCarlton<jwcarl...(a)gmail.com> wrote:
> > >Every once in awhile, someone will copy and paste into my message
> > >board from Word. After it submits through my Perl script, I'll have
> > >something like this plugged in:
>
> > >Normal 0 false false false EN-US X-NONE X-NONE
> > >MicrosoftInternetExplorer4 /* Style Definitions */
> > >table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-
> > >rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-
> > >style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-
> > >padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-
> > >margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:
> > >0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt;
> > >font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-
> > >ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New
> > >Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-
> > >family:Calibri; mso-hansi-theme-font:minor-latin;}
>
> > >The fonts and all that are different for each post; the only
> > >consistency seems to be that it starts with "Normal 0 false false
> > >false", and it ends with a "}".
>
> > >Would something as simple as this be enough to consistently remove it?
>
> > >$comment =~ s/Normal 0 false false false.*?}//gsi;
>
> > >Or is there more to it than I'm thinking?
>
> > $comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;
>
> Thanks, s.

Unfortunately, neither of these are working the way I expected:

$comment =~ s/Normal 0 false false false.*?}//gsi;
$comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;

It's catching the "Normal 0 false false false", but not everything
else that comes after, and before the "}".

How do I make it remove everything from "Normal 0 false false false"
until it finds the first "}"?

TIA,

Jason
From: sln on
On Thu, 25 Mar 2010 10:41:09 -0700 (PDT), Jason Carlton <jwcarlton(a)gmail.com> wrote:

>On Mar 9, 11:49 pm, Jason Carlton <jwcarl...(a)gmail.com> wrote:
>> On Mar 9, 9:21 pm, s...(a)netherlands.com wrote:
>>
>>
>>
>>
>>
>> > On Mon, 8 Mar 2010 19:03:03 -0800 (PST),JasonCarlton<jwcarl...(a)gmail.com> wrote:
>> > >Every once in awhile, someone will copy and paste into my message
>> > >board from Word. After it submits through my Perl script, I'll have
>> > >something like this plugged in:
>>
>> > >Normal 0 false false false EN-US X-NONE X-NONE
>> > >MicrosoftInternetExplorer4 /* Style Definitions */
>> > >table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-
>> > >rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-
>> > >style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-
>> > >padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-
>> > >margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:
>> > >0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt;
>> > >font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-
>> > >ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New
>> > >Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-
>> > >family:Calibri; mso-hansi-theme-font:minor-latin;}
>>
>> > >The fonts and all that are different for each post; the only
>> > >consistency seems to be that it starts with "Normal 0 false false
>> > >false", and it ends with a "}".
>>
>> > >Would something as simple as this be enough to consistently remove it?
>>
>> > >$comment =~ s/Normal 0 false false false.*?}//gsi;
>>
>> > >Or is there more to it than I'm thinking?
>>
>> > $comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;
>>
>> Thanks, s.
>
>Unfortunately, neither of these are working the way I expected:
>
>$comment =~ s/Normal 0 false false false.*?}//gsi;
>$comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;
>
>It's catching the "Normal 0 false false false", but not everything
>else that comes after, and before the "}".
>
>How do I make it remove everything from "Normal 0 false false false"
>until it finds the first "}"?
>
>TIA,
>
>Jason

You can generalize it more:

$comment =~ s/Normal \s* \d+ \s* false \s* false \s* false [^}]* \} //xig;

But it's probably not matching, so the real format must be different; maybe
there is no terminating '}' in the real text. You don't need /s if you don't
have a '.' in the pattern, which is why I used [^}]* \}
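
To illustrate the point about /s, here is a minimal sketch. The sample string is made up for the example (it is not Jason's real data), and it assumes the residue spans several lines. A negated class like [^}] matches newlines on its own, while '.' only does so under /s.

```perl
use strict;
use warnings;

# Hypothetical sample: Word residue wrapped across several lines.
my $sample = "Hello Normal 0 false false false EN-US\n"
           . "table.MsoNormalTable {mso-style-name:\"Table Normal\";\n"
           . "font-size:11.0pt;} world";

# A negated class such as [^}] matches newlines, so /s is not required.
(my $with_class = $sample) =~ s/Normal 0 false false false[^}]*\}//;

# '.' does not match a newline without /s, so this one fails to match.
(my $dot_no_s = $sample) =~ s/Normal 0 false false false.*?\}//;

# With /s, '.' matches newlines and the lazy .*? stops at the first '}'.
(my $dot_with_s = $sample) =~ s/Normal 0 false false false.*?\}//s;

print "class:   [$with_class]\n";
print "no /s:   [$dot_no_s]\n";
print "with /s: [$dot_with_s]\n";
```

Only the middle substitution should leave the text untouched, since the first '}' is two lines below the starting phrase.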

It's not a good idea to grab everything between "Normal" and "}",
as that's not really enough information to build a reliable pattern.

It looks like this line:
Normal 0 false false false EN-US X-NONE X-NONE MicrosoftInternetExplorer4
is a space-delimited set of variable settings, followed by
a '{' block '}' delimited set of style definitions.

You could use alternation to flag the start of the definition if you
know the possible values (the slots look constant), so:

$comment =~ s/ (?:Normal|<something else>) \s* \d+ \s* (?:false|true) \s* (?:false|true) \s* (?:false|true) [^}]* \} //xig;

But I don't know this format, and it possibly can't be relied upon.
Also, the regex requires that there be a style block (or at least something
with a '}' as the terminator).
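
A rough sketch of the alternation idea and of the '}' requirement, under some assumptions: strip_word_residue is a hypothetical helper name, both sample strings are invented, and the tokens are separated by \s+ rather than \s*. Only "Normal" is used as the leading keyword, since the other possible values are not known here.

```perl
use strict;
use warnings;

# strip_word_residue is a hypothetical helper name, not from the thread.
sub strip_word_residue {
    my ($text) = @_;
    $text =~ s/ Normal \s+ \d+
                (?: \s+ (?:false|true) ){3}   # three boolean slots
                [^}]* \}                      # everything up to the first closing brace
              //xig;
    return $text;
}

# Invented inputs: one with a style block, one without any closing brace.
my $with_block    = "Hi Normal 0 false TRUE false EN-US p {color:red;} there";
my $without_block = "Hi Normal 0 false false false EN-US there";

print strip_word_residue($with_block), "\n";     # residue removed
print strip_word_residue($without_block), "\n";  # unchanged: nothing to end on
```

The second call shows the limitation: with no '}' anywhere after the settings, the pattern cannot match and the residue stays in the text.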

-sln
From: J. Gleixner on
Jason Carlton wrote:
> On Mar 9, 11:49 pm, Jason Carlton <jwcarl...(a)gmail.com> wrote:
>> On Mar 9, 9:21 pm, s...(a)netherlands.com wrote:
>>
>>
>>
>>
>>
>>> On Mon, 8 Mar 2010 19:03:03 -0800 (PST),JasonCarlton<jwcarl...(a)gmail.com> wrote:
>>>> Every once in awhile, someone will copy and paste into my message
>>>> board from Word. After it submits through my Perl script, I'll have
>>>> something like this plugged in:
>>>> Normal 0 false false false EN-US X-NONE X-NONE
>>>> MicrosoftInternetExplorer4 /* Style Definitions */
>>>> table.MsoNormalTable {mso-style-name:"Table Normal"; mso-tstyle-
>>>> rowband-size:0; mso-tstyle-colband-size:0; mso-style-noshow:yes; mso-
>>>> style-priority:99; mso-style-qformat:yes; mso-style-parent:""; mso-
>>>> padding-alt:0in 5.4pt 0in 5.4pt; mso-para-margin-top:0in; mso-para-
>>>> margin-right:0in; mso-para-margin-bottom:10.0pt; mso-para-margin-left:
>>>> 0in; line-height:115%; mso-pagination:widow-orphan; font-size:11.0pt;
>>>> font-family:"Calibri","sans-serif"; mso-ascii-font-family:Calibri; mso-
>>>> ascii-theme-font:minor-latin; mso-fareast-font-family:"Times New
>>>> Roman"; mso-fareast-theme-font:minor-fareast; mso-hansi-font-
>>>> family:Calibri; mso-hansi-theme-font:minor-latin;}
>>>> The fonts and all that are different for each post; the only
>>>> consistency seems to be that it starts with "Normal 0 false false
>>>> false", and it ends with a "}".
>>>> Would something as simple as this be enough to consistently remove it?
>>>> $comment =~ s/Normal 0 false false false.*?}//gsi;
>>>> Or is there more to it than I'm thinking?
>>> $comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;
>> Thanks, s.
>
> Unfortunately, neither of these are working the way I expected:
>
> $comment =~ s/Normal 0 false false false.*?}//gsi;
> $comment =~ s/Normal 0 false false false[^{]+\{[^}]+\}//;
>
> It's catching the "Normal 0 false false false", but not everything
> else that comes after, and before the "}".
>
> How do I make it remove everything from "Normal 0 false false false"
> until it finds the first "}"?

$comment =~ s/Normal 0 false false false[^}]*}//gsi;

my $str = 'Start Normal 0 false false false blah blah { more blah }
Starting second match Normal 0 false false false blah blah { more blah }
The End';
$str =~ s/Normal 0 false false false[^}]*}//gsi;
print $str;

Start
Starting second match
The End
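
If the substitution works on a test string like this but not on the stored comment, the real text probably differs in some way that is not visible on screen. As a sketch of one way to check, assuming the difference is hidden whitespace (the sample below is invented, with a tab and a CRLF among the "spaces"), Data::Dumper with Useqq prints such characters as escapes:

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;   # show control and non-ASCII bytes as escapes
$Data::Dumper::Terse = 1;   # omit the "$VAR1 =" prefix

# Hypothetical stored comment: a tab and a CRLF hide among the "spaces".
my $comment = "Intro Normal\t0 false\r\nfalse false EN-US p {x:y;} end";

# Dump a window of text starting at "Normal" with invisible characters escaped.
if ($comment =~ /(Normal.{0,40})/s) {
    print Dumper($1);
}

# Allowing any whitespace between the tokens copes with this particular input.
(my $cleaned = $comment) =~ s/Normal\s+\d+\s+false\s+false\s+false[^}]*\}//gi;
print Dumper($cleaned);
```

Whether \s+ is enough for the real data depends on what the dump shows; other differences (HTML entities, for instance) would need a different pattern.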