Better spam filter for postfix [Postfix]


From: "Steve" on 15 Jul 2010 15:29

-------- Original Message --------
> Date: Thu, 15 Jul 2010 12:03:17 -0700
> From: Bradley Giesbrecht <bradley.giesbrecht(a)gmail.com>
> To: postfix-users <postfix-users(a)postfix.org>
> Subject: Re: Better spam filter for postfix

> Or sqlgrey, a fork of postgrey.
>
> http://sqlgrey.sourceforge.net/
>
Or GROSS (the only greylisting application I know of that works with a Bloom filter: http://en.wikipedia.org/wiki/Bloom_filter).

http://code.google.com/p/gross/
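
For readers unfamiliar with the idea, here is a minimal sketch of greylisting backed by a Bloom filter. It is not GROSS's actual implementation: the class, the function name, the filter size and the use of an IP/sender/recipient triplet as the key are all assumptions made for illustration, and a real greylister would also enforce a minimum retry delay and expire old entries.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array probed by k hashes.

    Membership tests can give false positives but never false negatives.
    """

    def __init__(self, size_bits=8192, num_hashes=4):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((size_bits + 7) // 8)

    def _positions(self, key):
        # Derive k bit positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, key):
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(key)
        )


def greylist_decision(seen, client_ip, sender, recipient):
    """Return 'defer' the first time a triplet is seen, 'accept' on a retry."""
    triplet = f"{client_ip}|{sender}|{recipient}"
    if triplet in seen:
        return "accept"
    seen.add(triplet)
    return "defer"
```

The appeal of the Bloom filter here is constant memory: the filter never grows, at the cost of occasionally accepting a first-time sender because of a false positive.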

> On Jul 15, 2010, at 11:59 AM, Kai Krakow wrote:
>
> > Use greylisting, e.g. postgrey, and set it up to work before amavisd-new
> > or MailScanner.
> >
> > 2010/7/15 Josh Cason <jocaso(a)mychoice.cc>
> >>
> >> As most of you guys know, I use MailScanner. I would like
> >> recommendations on what else to use. I prefer an all-in-one package
> >> like MailScanner, which also uses ClamAV and SpamAssassin.
> >> The problem is that most of the information I find on the net is
> >> outdated or covers projects that have stopped. It seems everybody has
> >> their own way of dealing with spam filtering, so this is a request for
> >> what you guys find the most useful. I'm hosting multiple domains (virtual
> >> via MySQL), so I cannot be specific to each one. Also, I'm using
> >> Postini with some but not all of the domains.
> >>
> >> Thanks,
> >>
> >> Josh
> >>
> >>
> >> --
> >> This message has been scanned for viruses and
> >> dangerous content by Mychoice, and is
> >> believed to be clean.
> >>


From: joe on 15 Jul 2010 15:59

On 07/15/2010 12:29 PM, Steve wrote:
> Or GROSS (the only greylisting application that I know working with a bloom filter (http://en.wikipedia.org/wiki/Bloom_filter)).
>
> http://code.google.com/p/gross/
>

Thanks for the link, what I see there is very interesting - I'll check
this out...

Joe

From: Henrik K on 15 Jul 2010 16:54

On Thu, Jul 15, 2010 at 09:02:52PM +0200, Steve wrote:
>
> -------- Original Message --------
> > Date: Thu, 15 Jul 2010 19:37:48 +0200
> > From: Ralf Hildebrandt <Ralf.Hildebrandt(a)charite.de>
> > To: postfix-users(a)postfix.org
> > Subject: Re: Better spam filter for postfix
>
> > * Josh Cason <jocaso(a)mychoice.cc>:
> >
> > > > As most of you guys know, I use MailScanner. I would like
> > > > recommendations on what else to use. I prefer an all-in-one package
> > > > like MailScanner, which also uses ClamAV and SpamAssassin.
> >
> > So does amavisd-new
> >
>
> If you are looking for something that goes beyond just being better, then I
> recommend CRM114, DSPAM or OSBF-Lua. If you insist on having the AV
> included in the anti-spam tool, then use something like DSPAM.

I'd consider those as "engines". You can run one or all of them if you
really want. MailScanner, Amavisd-new, Mimedefang and even SA (as a
framework) are some of the "glues" that might utilize them. Also ClamAV
isn't just an "AV" tool. It's a lot more of an Anti-Spam tool when used with
Sanesecurity signatures etc.

There are a million combinations of glues, engines and other general
anti-spam methods. You need to be very clear on your needs to get a
meaningful answer (and maybe not even then).

> I use all of the above, and all of them are fast and accurate.
> DSPAM is the easiest to scale and uses the least memory (on my setup
> DSPAM alone uses less than 10MB of memory for hundreds of domains with
> thousands of users in total).
> From an algorithmic viewpoint CRM114 is an insane tool. It offers a lot of
> algorithms and is extensible to virtually anything you like (it includes
> its own language).
>
> If you used SA in the past then any of the above will surprise you in
> terms of speed, memory consumption and accuracy.

Generally DSPAM etc require user interaction/learning. SA does not, since
it's a framework of rules and plugins and can autolearn Bayes if you want to
- or even do the same for DSPAM etc if you use them as SA plugins. Let's not
forget that DSPAM etc also require a database backend, which might require
lots of memory and/or disk, so it's not exactly "free" either. Accuracy
depends heavily on configuration of all the components and other voodoo.
There are no easy answers.

From: "Steve" on 15 Jul 2010 17:16

-------- Original Message --------
> Date: Thu, 15 Jul 2010 23:54:22 +0300
> From: Henrik K <hege(a)hege.li>
> To: postfix-users(a)postfix.org
> Subject: Re: Better spam filter for postfix

> On Thu, Jul 15, 2010 at 09:02:52PM +0200, Steve wrote:
> >
> > -------- Original Message --------
> > > Date: Thu, 15 Jul 2010 19:37:48 +0200
> > > From: Ralf Hildebrandt <Ralf.Hildebrandt(a)charite.de>
> > > To: postfix-users(a)postfix.org
> > > Subject: Re: Better spam filter for postfix
> >
> > > * Josh Cason <jocaso(a)mychoice.cc>:
> > >
> > > > > As most of you guys know, I use MailScanner. I would like
> > > > > recommendations on what else to use. I prefer an all-in-one package
> > > > > like MailScanner, which also uses ClamAV and SpamAssassin.
> > >
> > > So does amavisd-new
> > >
> >
> > If you are looking for something that goes beyond just being better, then I
> > recommend CRM114, DSPAM or OSBF-Lua. If you insist on having the AV
> > included in the anti-spam tool, then use something like DSPAM.
>
> I'd consider those as "engines". You can run one or all of them if you
> really want. MailScanner, Amavisd-new, Mimedefang and even SA (as a
> framework) are some of the "glues" that might utilize them.
>
Well... those so-called "engines" can run on their own. They don't need to be wrapped inside any of the "glues" you mention, especially not when those "glues" are memory hogs.

> Also ClamAV
> isn't just an "AV" tool. It's a lot more of an Anti-Spam tool when used
> with Sanesecurity signatures etc.
>
> There are a million combinations of glues, engines and other general
> anti-spam methods. You need to be very clear on your needs to get a
> meaningful answer (and maybe not even then).
>
> > I use all of the above, and all of them are fast and accurate.
> > DSPAM is the easiest to scale and uses the least memory (on my setup
> > DSPAM alone uses less than 10MB of memory for hundreds of domains with
> > thousands of users in total).
> > From an algorithmic viewpoint CRM114 is an insane tool. It offers a lot of
> > algorithms and is extensible to virtually anything you like (it includes
> > its own language).
> >
> > If you used SA in the past then any of the above will surprise you in
> > terms of speed, memory consumption and accuracy.
>
> Generally DSPAM etc require user interaction/learning.
>
So do CRM114 and OSBF-Lua. But you are wrong in thinking that they need an insane amount of training/learning.

> SA does not, since
> it's a framework of rules and plugins and can autolearn Bayes if you want
> to - or even do the same for DSPAM etc if you use them as SA plugins. Let's
> not forget that DSPAM etc also require a database backend,
>
You are WRONG. DSPAM does NOT require a database backend. I don't know where you got that from. DSPAM CAN use a database backend but runs well without one (using the Hash driver).

> which might require
> lots of memory and/or disk, so it's not exactly "free" either. Accuracy
> depends heavily on configuration of all the components and other voodoo.
>
What? Voodoo? Yeah, right. There is less voodoo in CRM114, OSBF-Lua and DSPAM than in SA. I explain the following to a user:
* You get mail, and if the anti-spam filter classifies it wrongly, you correct it and the filter learns.
* Any wrong classification is based on the prior classifications YOU have fed to the anti-spam filter.
* If you feed wrong data to the anti-spam filter, the filter will make errors.
* The more you correct, the higher the accuracy gets, and the fewer errors you need to correct.

That's easy to understand.
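
The loop described in those bullets is often called train-on-error. A minimal sketch of it follows, using a toy naive Bayes token counter. The `TokenFilter` class and its method names are hypothetical and do not mirror how DSPAM, CRM114 or OSBF-Lua are implemented; those tools use more elaborate tokenizers and scoring.

```python
import math
from collections import Counter


class TokenFilter:
    """Toy naive Bayes classifier over whitespace-separated tokens."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}

    def learn(self, text, label):
        self.counts[label].update(text.lower().split())

    def classify(self, text):
        # Vocabulary size across both classes, plus one slot for unseen tokens.
        vocab = len(self.counts["spam"] | self.counts["ham"]) + 1
        scores = {}
        for label, counter in self.counts.items():
            total = sum(counter.values())
            # Add-one smoothing so an unseen token does not zero the score.
            scores[label] = sum(
                math.log((counter[tok] + 1) / (total + vocab))
                for tok in text.lower().split()
            )
        return "spam" if scores["spam"] > scores["ham"] else "ham"

    def deliver(self, text, user_label=None):
        """Classify a message; retrain only when the user reports an error."""
        guess = self.classify(text)
        if user_label is not None and user_label != guess:
            self.learn(text, user_label)
        return guess
```

An untrained filter calls everything ham. After the user corrects one message, the same message is classified as spam, and messages the filter already gets right cause no further learning.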

IMHO that is easier to explain than telling the user:
* There is an army of rule writers out there writing rules for SA, where THEY decide what is spam and what is ham.

And if the user asks me "what rules are those?", then I would have to say that there are a gazillion rules that I cannot explain in detail without taking up much of his time going through them one by one.

Anyway...

For me the three products mentioned are all better than SA: they have a smaller memory footprint, are way faster, require less maintenance when properly set up, and are way more accurate.

And regarding the training:
DSPAM and CRM114 offer features that let you pre-learn, so that your users have high accuracy from day one (generally above 95%), and if they reclassify the first bunch of errors their accuracy easily jumps above 98.x%/99.x%. In DSPAM that kind of setup is accomplished with merged groups, classification groups or shared groups.
In CRM114 you can allocate and merge as many CSS files as you like at run time (one pre-trained file should be enough).
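
As a rough illustration of the pre-learning idea, the sketch below combines a shared, pre-trained token table with one user's own corrections at classification time. The function names and the simple ratio score are assumptions made for this example; they are not how DSPAM's merged groups or CRM114's CSS files work internally.

```python
from collections import Counter


def merged_counts(shared, personal):
    """Combine a shared pre-trained token table with a user's own data."""
    return shared + personal


def spam_ratio(token, spam_counts, ham_counts):
    """Fraction of a token's sightings that were in spam; 0.5 if never seen."""
    s, h = spam_counts[token], ham_counts[token]
    return s / (s + h) if s + h else 0.5


# Shared corpus: "viagra" is nearly always spam.
shared_spam = Counter({"viagra": 9})
shared_ham = Counter({"viagra": 1})

# A hypothetical pharmacist has marked ten such mails as legitimate.
user_spam = Counter()
user_ham = Counter({"viagra": 10})

day_one = spam_ratio("viagra", shared_spam, shared_ham)
after_corrections = spam_ratio(
    "viagra",
    merged_counts(shared_spam, user_spam),
    merged_counts(shared_ham, user_ham),
)
```

A new user starts from the shared statistics, and their own corrections gradually outweigh the shared data for the tokens that matter to them.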

> There are no easy answers.
>
And this is generally the field where anti-spam tools that do not depend on pre-made rules shine, because they are very adaptive.

From: Henrik K on 15 Jul 2010 19:09

On Thu, Jul 15, 2010 at 11:16:43PM +0200, Steve wrote:
> > >
> > > If you are looking for something that goes beyond just being better, then I
> > > recommend CRM114, DSPAM or OSBF-Lua. If you insist on having the AV
> > > included in the anti-spam tool, then use something like DSPAM.
> >
> > I'd consider those as "engines". You can run one or all of them if you
> > really want. MailScanner, Amavisd-new, Mimedefang and even SA (as a
> > framework) are some of the "glues" that might utilize them.
> >
>
> Well.... those so called "engines" can run on their own. They don't need
> to be wrapped inside any of the "glues" you mention. Especially not when
> those "glues" are memory hogs.

Can you be more specific? Maybe you are addressing SA memory usage, which
might only matter in some cases. Servers have lots of memory these days, and
good MTA checks might reduce scanning needs greatly.

> > Generally DSPAM etc require user interaction/learning.
> >
> So do CRM114 and OSBF-Lua. But you are wrong in thinking that they need
> an insane amount of training/learning.

That's what I meant by "etc". I did use DSPAM exclusively for a few months
in the past, but for my personal use I saw no benefits from it.

> > SA does not, since
> > it's a framework of rules and plugins and can autolearn Bayes if you want
> > to
> > - or even do the same for DSPAM etc if you use them as SA plugins. Let's
> > not
> > forget that DSPAM etc also require a database backend,
> >
>
> You are WRONG. DSPAM does NOT require a database backend. I don't know
> where you got that from. DSPAM CAN use a database backend but runs
> well without one (using the Hash driver).

So you don't consider the CSS Hash driver a "database backend"? It requires
disk, memory and CPU to store and retrieve tokens. Whatever..

> > which might require
> > lots of memory and/or disk, so it's not exactly "free" either. Accuracy
> > depends heavily on configuration of all the components and other voodoo.
> >
>
> What? Voodoo? Yeah, right. There is less voodoo in CRM114, OSBF-Lua and DSPAM than in SA. I explain the following to a user:
> * You get mail, and if the anti-spam filter classifies it wrongly, you correct it and the filter learns.
> * Any wrong classification is based on the prior classifications YOU have fed to the anti-spam filter.
> * If you feed wrong data to the anti-spam filter, the filter will make errors.
> * The more you correct, the higher the accuracy gets, and the fewer errors you need to correct.
>
> That's easy to understand.
>
> IMHO that is easier to explain than telling the user:
> * There is an army of rule writers out there writing rules for SA, where THEY decide what is spam and what is ham.
>
> And if the user asks me "what rules are those?", then I would have to say that there are a gazillion rules that I cannot explain in detail without taking up much of his time going through them one by one.
>
> Anyway...

So you have made your point. You prefer (or are required) to have the user in
control.

I guess you don't use ANY methods (blacklists etc) other than the users' own
statistical input, since you might have to tell your users that "THEY"
thought your mail was spam?

> For me the three products mentioned are all better than SA: they have a
> smaller memory footprint, are way faster, require less maintenance when
> properly set up, and are way more accurate.

Good for you. Naturally resource usage is lower, the less stuff you do. One
has to balance needs against that.

But let's forget the accuracy bs, there are too many variables for such
generic claims to be made. You can achieve "happy users" with pretty much
any tool out there if used right.

I'm in a happy position to be able to reject/quarantine spam for 1000+ users
without ever bothering them with it, and very rarely get any questions about
mail. If I had to do it the ISP way, I might consider DSPAM, then again I
see nothing against using SA (or any other tool out there).

> And regarding the training: DSPAM and CRM114 offer features that let you
> pre-learn, so that your users have high accuracy from day one (generally
> above 95%), and if they reclassify the first bunch of errors their
> accuracy easily jumps above 98.x%/99.x%. In DSPAM that kind of setup is
> accomplished with merged groups, classification groups or shared groups.
> In CRM114 you can allocate and merge as many CSS files as you like at run
> time (one pre-trained file should be enough).

You make it sound like statistical filters are invincible against different
mail flows and pure user stupidity.

> > There are no easy answers.
> >
>
> And this is generally the field where Anti-Spam tools that do not depend
> on pre-made rules are shining, because they are very adaptive.

Right, like SA for example only depends on "pre-made" rules and doesn't have
any statistical or realtime capabilities..

I think continuing this is pointless and a bit off-topic.
