From: Paul Halliday on
I am working on a parser for logs from a spam firewall. The format is
predictable until it reaches a certain point. It then varies greatly.

There are two things I want to grab from this area: the size of the
message (if it exists) and the subject (if it exists).

The line might look something like this:

- 2 39 some.text.here SZ:1825 SUBJ: A subject here

but it could also look like this:

5 6 421 Error: timeout

or this:

5 6 421 Client disconnected

All I really want is the value for each, not the prefix, which means
the code below still needs another step to strip it. Yuck.

I am doing it like this:

$remainder = explode(" ", $theLine, 18);
$s_size = '/SZ:\d+/';
$s_subject = '/SUBJ:.+/';

preg_match($s_size, $remainder[17], $a);
preg_match($s_subject, $remainder[17], $b);

if (count($a) > 0) {
    $size = $a[0];
} else {
    $size = 0;
}

if (count($b) > 0) {
    $subject = $b[0];
} else {
    $subject = "-";
}

Is there any way to clean this up a bit?

thanks.
From: Richard Quadling on
On 25 March 2010 16:42, Paul Halliday <paul.halliday(a)gmail.com> wrote:
> Is there any way to clean this up a bit?

You can do it handraulically using string functions. You can also use
regular expressions.
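
As a rough illustration of the string-function route, here is a minimal sketch. The helper name fieldAfter() and its parameters are hypothetical, and it assumes the size value ends at the next space while the subject runs to the end of the line, which is only what the sample lines suggest.

```php
<?php
// Hypothetical helper: returns the text that follows $label in $tail.
// When $toEnd is false the value stops at the next space; otherwise it
// runs to the end of the line. $default is returned if $label is absent.
function fieldAfter($tail, $label, $toEnd, $default)
{
    $pos = strpos($tail, $label);
    if ($pos === false) {
        return $default;                 // label not present on this line
    }
    $rest = substr($tail, $pos + strlen($label));
    if ($toEnd) {
        return trim($rest);              // e.g. the subject
    }
    // strcspn() counts the characters before the first space
    return substr($rest, 0, strcspn($rest, ' '));
}

$tail    = '- 2 39 some.text.here SZ:1825 SUBJ: A subject here';
$size    = fieldAfter($tail, 'SZ:', false, 0);
$subject = fieldAfter($tail, 'SUBJ:', true, '-');
```

Here $tail stands in for $remainder[17] from the original post. Note that this version does not check that the size is numeric; it simply takes whatever follows the label.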

If you can supply a sensible chunk, I can build a regex for you.
Indicate the exact elements you want to retrieve.

If you want to email me the log file directly that's fine.
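
In the meantime, a sketch of the regular-expression route based only on the three sample lines in the question: a capture group returns the value without the SZ: or SUBJ: prefix, and the ternary supplies the same defaults (0 and "-") as the original code. The function name parseTail() is hypothetical, and the patterns are guesses that may need adjusting against the real log.

```php
<?php
// Hypothetical helper: extracts the size and subject from the variable
// tail of a log line. Returns array($size, $subject).
function parseTail($tail)
{
    // Capture only the digits after "SZ:"; fall back to 0 when absent.
    $size = preg_match('/\bSZ:(\d+)/', $tail, $m) ? (int) $m[1] : 0;

    // Capture everything after "SUBJ:" up to the end of the line;
    // fall back to "-" when absent.
    $subject = preg_match('/\bSUBJ:\s*(.+)/', $tail, $m) ? rtrim($m[1]) : '-';

    return array($size, $subject);
}

list($size, $subject) = parseTail('- 2 39 some.text.here SZ:1825 SUBJ: A subject here');
// $size is 1825, $subject is "A subject here"
```

Because preg_match() returns 1 on a match and 0 otherwise, there is no need to count() the match array afterwards.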

--
Richard Quadling
From: Per Jessen on
Paul Halliday wrote:

>
> Is there any way to clean this up a bit?
>

This is what I usually do:

// $linepattern1 .. $linepattern4 and $text are placeholders
if ( ($matches = preg_match($linepattern1, $text, $match)) > 0 )
{
    // do stuff specific to linepattern1
}
elseif ( ($matches = preg_match($linepattern2, $text, $match)) > 0 )
{
    // do stuff specific to linepattern2
}
elseif ( ($matches = preg_match($linepattern3, $text, $match)) > 0 )
{
    // do stuff specific to linepattern3
}
elseif ( ($matches = preg_match($linepattern4, $text, $match)) > 0 )
{
    // do stuff specific to linepattern4
}
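
To make the chain above concrete, here is a sketch with patterns filled in for the three sample lines from the question. The function name classifyTail(), the array keys, and the patterns themselves are all assumptions drawn from those samples, not from the firewall's documented format.

```php
<?php
// Hypothetical classifier for the variable tail of a log line.
// One pattern per line type, tried in order; the first match wins.
function classifyTail($tail)
{
    if (preg_match('/\bSZ:(\d+)/', $tail, $m)) {
        // Message line: size present, subject optional.
        // (A line with SUBJ: but no SZ: falls through to 'unknown'.)
        $size    = (int) $m[1];
        $subject = preg_match('/\bSUBJ:\s*(.+)/', $tail, $s) ? rtrim($s[1]) : '-';
        return array('type' => 'message', 'size' => $size, 'subject' => $subject);
    } elseif (preg_match('/\bError:\s*(.+)/', $tail, $m)) {
        // Error line, e.g. "5 6 421 Error: timeout"
        return array('type' => 'error', 'detail' => rtrim($m[1]));
    } elseif (preg_match('/\bClient disconnected\b/', $tail)) {
        return array('type' => 'disconnect');
    }
    return array('type' => 'unknown');
}

$info = classifyTail('5 6 421 Error: timeout');
// $info['type'] is "error", $info['detail'] is "timeout"
```

The message pattern is tested first so that a subject which happens to contain "Error:" is not mistaken for an error line.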



-- 
Per Jessen, Zürich (15.9°C)