email scanning / filtering

Fri Dec 14 17:36:28 UTC 2018

On Fri, Dec 14, 2018 at 06:30:08AM -0500, David Funderburk wrote:
> What open source email filtering system is working well for you? 

I've been studying email abuse for a very long time, and am writing
a book about defending against it with open-source tools.

One of the things that I've learned over those decades is that while
some measures make sense for everyone, one size does not fit all, and
that it's critical to understand the mail stream that's being presented
before trying to design and build systems to deal with it.  Everyone's
legitimate email looks different.  Everyone's abusive email looks different.
It's not possible to figure out how to cope with these things until
you measure them.
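
As a rough illustration of what "measuring" can look like, here is a
minimal Python sketch that tallies inbound SMTP connections per IPv4 /24
network.  It assumes Postfix-style smtpd "connect from host[ip]" log
lines; the sample lines, the function name, and the choice of /24 are
illustrative assumptions, not a description of any particular setup.

```python
import ipaddress
import re
from collections import Counter

# Assumed log format: Postfix-style smtpd lines such as
#   "... postfix/smtpd[311]: connect from unknown[192.0.2.10]"
# The \b keeps "disconnect from" lines from being counted as well.
CONNECT_RE = re.compile(r"\bconnect from \S*\[(\d{1,3}(?:\.\d{1,3}){3})\]")


def tally_networks(lines, prefix=24):
    """Count inbound SMTP connections per IPv4 network of the given size."""
    counts = Counter()
    for line in lines:
        match = CONNECT_RE.search(line)
        if not match:
            continue
        try:
            # strict=False masks off the host bits, e.g. .10/24 -> .0/24
            net = ipaddress.ip_network(f"{match.group(1)}/{prefix}", strict=False)
        except ValueError:
            continue  # skip things that look like addresses but are not
        counts[str(net)] += 1
    return counts


# Made-up sample lines using documentation address ranges.
sample = [
    "Dec 14 06:30:08 mx postfix/smtpd[311]: connect from unknown[192.0.2.10]",
    "Dec 14 06:30:09 mx postfix/smtpd[312]: connect from mail.example.org[198.51.100.7]",
    "Dec 14 06:30:11 mx postfix/smtpd[313]: connect from unknown[192.0.2.77]",
    "Dec 14 06:30:12 mx postfix/smtpd[313]: disconnect from unknown[192.0.2.77]",
]

for net, n in tally_networks(sample).most_common():
    print(net, n)
```

Run against a few weeks of real logs, a tally like this is only a
starting point; the same approach extends to counting by sender domain,
HELO name, or recipient address.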

Nor is it possible until you understand the operational requirements,
which, again, are different for everyone.  Joe's Donuts in Dubuque
probably isn't going to be receiving messages at its "orders" address
from Peru or Pakistan, for example, so any incoming traffic like that
is almost certainly misdirected (at best) or abusive.  On the other
hand, Michigan State University will probably receive legitimate
traffic from all over the world, including Peru and Pakistan.

Unfortunately, lots of people skip these two steps -- especially the
first one -- because they perceive them as onerous and unnecessary.
They thus hamstring their own efforts.

One of the other things I've learned is that there's a correct order
in which to apply defensive measures: one that minimizes the
probabilities of both FP and FN (false positives and false negatives),
leaves each successive measure with less work to do than the one
before, and puts the measures that consume the fewest resources up
front.  (For example: using the DROP list in a perimeter router,
firewall, or even the MTA's configuration is a highly efficient,
low-cost, low-resource measure that should be done before doing other
things.  This is, by the way, one of the measures that make sense
for everyone, as noted above.)
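
The ordering idea can be sketched in Python as a chain of checks run
cheapest-first, stopping at the first rejection.  Everything here is a
hypothetical stand-in: the networks are documentation ranges rather than
real DROP entries, and the HELO and content checks are placeholders for
whatever measures suit a given mail stream.  In practice the first check
belongs in the router, firewall, or MTA, not in a script.

```python
import ipaddress

# Hypothetical stand-in for a DROP-style list of networks.  These are
# documentation ranges, not real list entries.
BLOCKED_NETS = [ipaddress.ip_network(n) for n in ("192.0.2.0/24", "203.0.113.0/24")]


def in_blocklist(msg):
    """Cheapest check: is the connecting address inside a blocked network?"""
    ip = ipaddress.ip_address(msg["client_ip"])
    return any(ip in net for net in BLOCKED_NETS)


def bad_helo(msg):
    """Cheap syntactic check (illustrative): HELO name is not dotted."""
    return "." not in msg["helo"]


def content_scan(msg):
    """Placeholder for an expensive content inspection step."""
    return "cheap pills" in msg["body"].lower()


# Ordered from least to most resource-hungry; each later check only
# sees what the earlier ones let through.
CHECKS = [("blocklist", in_blocklist), ("helo", bad_helo), ("content", content_scan)]


def first_rejection(msg):
    """Return the name of the first check that rejects msg, or None."""
    for name, check in CHECKS:
        if check(msg):
            return name
    return None


msg = {"client_ip": "192.0.2.5", "helo": "mail.example.org", "body": "hello"}
print(first_rejection(msg))
```

The point of the sketch is the shape, not the specific checks: a message
stopped by the address lookup never costs a content scan.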

So while I could answer your question by telling you what I use,
that doesn't mean that it would work for you.  It *might*, and
after a fashion, it probably would -- but it's highly unlikely
that it's anything close to optimal for your environment.  There's
a fair amount of homework that needs to be done to figure that out.

One more thing.  There are a number of things that some people do
in their email systems which are worst practices -- things that
exacerbate the problem.  For example, "quarantines" or "spam folders"
are a profoundly horrible idea that should never be deployed.
(Ask RSA how that's working out for them.)  Avoid these.

---rsk