From: Richard Lamboj on

Hello,

I want to parse this string:

version 3.5.1 {

$pid_dir = /opt/samba-3.5.1/var/locks/
$bin_dir = /opt/samba-3.5.1/bin/

service smbd {
bin = ${bin_dir}smbd -D
pid = ${pid_dir}smbd.pid
}
service nmbd {
bin = ${bin_dir}nmbd -D
pid = ${pid_dir}nmbd.pid
}
service winbindd {
bin = ${bin_dir}winbindd -D
pid = ${pid_dir}winbindd.pid
}
}

version 3.2.14 {

$pid_dir = /opt/samba-3.5.1/var/locks/
$bin_dir = /opt/samba-3.5.1/bin/

service smbd {
bin = ${bin_dir}smbd -D
pid = ${pid_dir}smbd.pid
}
service nmbd {
bin = ${bin_dir}nmbd -D
pid = ${pid_dir}nmbd.pid
}
service winbindd {
bin = ${bin_dir}winbindd -D
pid = ${pid_dir}winbindd.pid
}
}

Step 1:

version 3.2.14 {

$pid_dir = /opt/samba-3.5.1/var/locks/
$bin_dir = /opt/samba-3.5.1/bin/

service smbd {
bin = ${bin_dir}smbd -D
pid = ${pid_dir}smbd.pid
}
service nmbd {
bin = ${bin_dir}nmbd -D
pid = ${pid_dir}nmbd.pid
}
service winbindd {
bin = ${bin_dir}winbindd -D
pid = ${pid_dir}winbindd.pid
}
}

Step 2:
service smbd {
bin = ${bin_dir}smbd -D
pid = ${pid_dir}smbd.pid
}
Step 3:
$pid_dir = /opt/samba-3.5.1/var/locks/
$bin_dir = /opt/samba-3.5.1/bin/

Step 4:
bin = ${bin_dir}smbd -D
pid = ${pid_dir}smbd.pid

My Regular Expressions:
version[\s]*[\w\.]*[\s]*\{[\w\s\n\t\{\}=\$\.\-_\/]*\}
service[\s]*[\w]*[\s]*\{([\n\s\w\=]*(\$\{[\w_]*\})*[\w\s\-=\.]*)*\}

I don't think that was a good solution. I am now trying with groups:
(service[\s\w]*)\{([\n\w\s=\$\-_\.]*)
but this part causes problems: ${bin_dir}

Kind Regards

Richi
From: Chris Rebert on
On Wed, Apr 7, 2010 at 1:37 AM, Richard Lamboj <richard.lamboj(a)bilcom.at> wrote:
> I want to parse this string:
>
> (snip)
>
> My Regular Expressions:
> version[\s]*[\w\.]*[\s]*\{[\w\s\n\t\{\}=\$\.\-_\/]*\}
> service[\s]*[\w]*[\s]*\{([\n\s\w\=]*(\$\{[\w_]*\})*[\w\s\-=\.]*)*\}
>
> I don't think that was a good solution. I am now trying with groups:
> (service[\s\w]*)\{([\n\w\s=\$\-_\.]*)
> but this part causes problems: ${bin_dir}

Regular expressions != Parsers

Every time someone tries to parse nested structures using regular
expressions, Jamie Zawinski kills a puppy.
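
To make the nesting point concrete, here is a minimal sketch (not anything from the original post) that splits the input into its top-level blocks by counting braces. A single pattern like the ones quoted above has no way to keep that count. The function name top_level_blocks is made up for illustration, and the sketch assumes braces only ever appear balanced, as they do in the sample (including inside ${...}).

```python
def top_level_blocks(text):
    """Yield (header, body) for each top-level `header { body }` block.

    Assumption: every '{' has a matching '}', as in the sample input,
    so the ${name} references simply nest one level deeper.
    """
    depth = 0
    start = 0          # index where the current header begins
    header = None
    body_start = None  # index just after the opening brace
    for i, ch in enumerate(text):
        if ch == "{":
            if depth == 0:
                header = text[start:i].strip()
                body_start = i + 1
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced '}' at offset %d" % i)
            if depth == 0:
                yield header, text[body_start:i]
                start = i + 1
    if depth != 0:
        raise ValueError("unclosed '{'")
```

Called on the whole file, this yields one (header, body) pair per version block. It only illustrates the counting; it does not separate the variable lines from the service blocks inside a body.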

Try using an *actual* parser, such as Pyparsing:
http://pyparsing.wikispaces.com/

Cheers,
Chris
--
Some people, when confronted with a problem, think:
"I know, I'll use regular expressions." Now they have two problems.
http://blog.rebertia.com
From: Bruno Desthuilliers on
Richard Lamboj wrote:
> Hello,
>
> I want to parse this string:
>
> version 3.5.1 {
>
> $pid_dir = /opt/samba-3.5.1/var/locks/
> $bin_dir = /opt/samba-3.5.1/bin/
>
> service smbd {
> bin = ${bin_dir}smbd -D
> pid = ${pid_dir}smbd.pid
> }
> service nmbd {
> bin = ${bin_dir}nmbd -D
> pid = ${pid_dir}nmbd.pid
> }
> service winbindd {
> bin = ${bin_dir}winbindd -D
> pid = ${pid_dir}winbindd.pid
> }
> }

(snip)

I think you'd be better off writing a specific parser here. Paul McGuire's
PyParsing package might help:

http://pyparsing.wikispaces.com/
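
For a format this small, a specific parser can also be written by hand. What follows is only a rough sketch using the standard library, not PyParsing's API. It assumes one statement per line, exactly as in the sample, and the name parse_config and the "vars"/"services" layout of the result are invented for illustration.

```python
def parse_config(text):
    """Parse the sample format into nested dicts, one statement per line."""
    result = {}
    version = None  # dict of the enclosing version block, if any
    service = None  # dict of the enclosing service block, if any
    for lineno, raw in enumerate(text.splitlines(), 1):
        line = raw.strip()
        if not line:
            continue
        if line == "}":
            # Close the innermost open block.
            if service is not None:
                service = None
            elif version is not None:
                version = None
            else:
                raise ValueError("line %d: unexpected '}'" % lineno)
        elif line.endswith("{"):
            words = line[:-1].split()
            if len(words) == 2 and words[0] == "version" and version is None:
                version = result[words[1]] = {"vars": {}, "services": {}}
            elif (len(words) == 2 and words[0] == "service"
                  and version is not None and service is None):
                service = version["services"][words[1]] = {}
            else:
                raise ValueError("line %d: bad block header %r" % (lineno, line))
        elif "=" in line:
            key, value = [part.strip() for part in line.split("=", 1)]
            if service is not None:
                service[key] = value
            elif version is not None and key.startswith("$"):
                version["vars"][key[1:]] = value  # store without the '$'
            else:
                raise ValueError("line %d: misplaced assignment" % lineno)
        else:
            raise ValueError("line %d: cannot parse %r" % (lineno, line))
    if version is not None:
        raise ValueError("unexpected end of input: missing '}'")
    return result
```

The values are stored as written, so a ${bin_dir} reference is left unexpanded. Tracking the current block in two variables is enough here because the sample nests only two levels deep; a deeper format would want a stack or a real grammar.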

My 2 cents.
From: Richard Lamboj on
On Wednesday 07 April 2010 10:52:14, Chris Rebert wrote:
> On Wed, Apr 7, 2010 at 1:37 AM, Richard Lamboj <richard.lamboj(a)bilcom.at> wrote:
> > (snip)
>
> Regular expressions != Parsers
>
> Every time someone tries to parse nested structures using regular
> expressions, Jamie Zawinski kills a puppy.
>
> Try using an *actual* parser, such as Pyparsing:
> http://pyparsing.wikispaces.com/
>
> Cheers,
> Chris
> --
> Some people, when confronted with a problem, think:
> "I know, I'll use regular expressions." Now they have two problems.
> http://blog.rebertia.com

Well, after some more attempts with regex, you're both right. I will use
Pyparsing; it seems to be the better solution.

Kind Regards
From: Patrick Maupin on
On Apr 7, 3:52 am, Chris Rebert <c...(a)rebertia.com> wrote:

> Regular expressions != Parsers

True, but lots of parsers *use* regular expressions in their
tokenizers. In fact, if you have a pure Python parser, you can often
get huge performance gains by rearranging your code slightly so that
you can use regular expressions in your tokenizer, because that
effectively gives you access to a fast, specialized C library that is
built into practically every Python interpreter on the planet.
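
As a rough sketch of what I mean (an illustration for the format in this thread, not code from any particular parser), one compiled pattern can do all the tokenizing and leave only the nesting logic to Python. The token names and the tokenize function are made up here, and the pattern assumes each value runs to the end of its line.

```python
import re

# One alternation, tried left to right at each position of the input.
TOKEN_RE = re.compile(r"""
    (?P<assign> (?P<key>\$?\w+) [ \t]* = [ \t]* (?P<value>[^\n]*) )
  | (?P<open>   (?P<kind>version|service) \s+ (?P<name>[\w.]+) \s* \{ )
  | (?P<close>  \} )
  | (?P<skip>   \s+ )
  | (?P<error>  . )
""", re.VERBOSE)


def tokenize(text):
    """Yield (type, a, b) tuples: ASSIGN key value, OPEN kind name, CLOSE."""
    for m in TOKEN_RE.finditer(text):
        if m.group("skip") is not None:
            continue  # whitespace between tokens carries no meaning
        if m.group("error") is not None:
            raise ValueError("unexpected character %r at offset %d"
                             % (m.group(), m.start()))
        if m.group("assign") is not None:
            yield ("ASSIGN", m.group("key"), m.group("value").rstrip())
        elif m.group("open") is not None:
            yield ("OPEN", m.group("kind"), m.group("name"))
        else:
            yield ("CLOSE", None, None)
```

A parser built on top of this only has to push on OPEN and pop on CLOSE. The character-level scanning all happens inside the re module.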

> Every time someone tries to parse nested structures using regular
> expressions, Jamie Zawinski kills a puppy.

And yet, if you are parsing stuff in Python, and your parser doesn't
use some specialized C code for tokenization (which will probably be
regular expressions unless you are using mxtexttools or some other
specialized C tokenizer code), your nested structure parser will be
dog slow.

Now, for some applications, the speed just doesn't matter, and for
people who don't yet know the difference between regexps and parsing,
pointing them at PyParsing is certainly doing them a valuable service.

But that's twice today when I've seen people warned off regular
expressions without a cogent explanation that, while the re module is
good at what it does, it really only handles the very lowest level of
a parsing problem.
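
The ${bin_dir} references from the original question are a fair example of that lowest level. They are flat, not nested, so re handles them comfortably. A small sketch, with the names VAR_REF and expand invented for illustration, assuming variable names consist of word characters only:

```python
import re

# Matches a flat ${name} reference; the name is captured in group 1.
VAR_REF = re.compile(r"\$\{(\w+)\}")


def expand(value, variables):
    """Replace each ${name} in value with variables[name]."""
    def lookup(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError("undefined variable: %s" % name)
        return variables[name]
    # A function replacement is inserted literally, with no escape handling.
    return VAR_REF.sub(lookup, value)
```

For instance, expand("${bin_dir}smbd -D", {"bin_dir": "/opt/samba-3.5.1/bin/"}) gives "/opt/samba-3.5.1/bin/smbd -D". Deciding which block a given variable belongs to is the part that needs a parser.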

My 2 cents is that something like PyParsing is absolutely great for
people who want a simple parser without a lot of work. But if people
use PyParsing, and then find out that (for their particular
application) it isn't fast enough, and then wonder what to do about
it, if all they remember is that somebody told them not to use regular
expressions, they will just come to the false conclusion that pure
Python is too painfully slow for any real world task.

Regards,
Pat