The End-To-End Internet (was Re: Blocking MX query)

Sat Sep 8 11:45:45 UTC 2012

On Wed, Sep 05, 2012 at 02:15:07PM -0700, Joe St Sauver wrote:
> 2) The Spamhaus CBL tracks the level of bot spam currently seen,
> including breaking out statistics by a number of factors.
> 
> 3) Currently, the US, where port 25 filtering is routinely deployed by
> most large ISPs, is ranked 158th among countries when you consider botted
> users on a per capita basis: http://cbl.abuseat.org/countrypercapita.html
> 
> 4) While that's not perfect (after all, there are still at least 133,811 
> listings for the US), on a PER-CAPITA basis, it's not bad -- that's just 
> ~0.055% of US Internet users that are infected, relative to some countries 
> where the rate of detected infection (based on spam emission) may be 4 to
> 5% or more.

I don't believe those numbers support that last point.  I *wish* they did,
but I don't think they do.  Here's why.

A. "bot spam seen" (by whatever number of sensors are deployed) is
conditional on bot spam making it out of its local network and onto
some other network where a sensor exists.  Clearly, port 25 blocking
will dramatically curtail that.  Thus, spam is still being generated
by those systems: it's just not getting anywhere.

B. Spam is not the only form of abuse generated by bots.  Some participate
in DDoS attacks, some host illicit web sites, some harvest addresses,
the list is endless.  Any sensor which only looks for spam arriving
via SMTP on port 25 will miss all those.

C. Some bots engage in secondary support activities (e.g., hosting
DNS for spammer domains) which are not intrinsically abusive, but are
certainly abusive in context.  Most of this will be missed by nearly
every sensor and nearly every observer.

D. Some bots do nothing -- that is, nothing overtly recognizable
by external sensors of any kind at any location.  They're either
harvesting local data or perhaps they're simply being held in reserve,
a practice our adversaries adopted quite early on.

Thus we can't use anybody's numbers for observed bot-generated spam
to estimate infection rates -- other than to set a lower bound on them.
The upper bound can be, and likely is, MUCH higher.  Doubly so
because there is absolutely no reason of any kind to think that infection
rates of US-based hosts significantly differ from global norms.
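
To make the lower-bound point concrete, here is a rough sketch of the
arithmetic.  The listing count and the ~0.055% rate are taken from the
quoted message; the "visible fraction" values are invented purely for
illustration (nobody has measured them), and the function names are
hypothetical.

```python
# Observed listings only count bots whose spam reached a sensor.
# If only a fraction of infected hosts are ever visible that way,
# the true count is observed / visible_fraction, so the observed
# figure is a lower bound on the real one.

US_LISTINGS = 133_811      # CBL listings for the US, from the quoted message
US_OBSERVED_RATE = 0.00055 # ~0.055% of US Internet users, from the quoted message


def implied_users(listings: int, observed_rate: float) -> float:
    """User population implied by a listing count and a per-capita rate."""
    return listings / observed_rate


def implied_infection_rate(listings: int, users: float, visible_fraction: float) -> float:
    """Infection rate if only visible_fraction of infected hosts get listed."""
    if not 0 < visible_fraction <= 1:
        raise ValueError("visible_fraction must be in (0, 1]")
    return (listings / visible_fraction) / users


users = implied_users(US_LISTINGS, US_OBSERVED_RATE)

# Hypothetical visibility levels: 100%, 50%, 10%, 1% of bots seen by sensors.
for visible in (1.0, 0.5, 0.1, 0.01):
    rate = implied_infection_rate(US_LISTINGS, users, visible)
    print(f"visible={visible:>5.0%}  implied infection rate={rate:.3%}")
```

If port 25 blocking hid all but 1 in 100 infected hosts from the sensors,
the same listing count would correspond to an infection rate of about
5.5%, not 0.055%.  That 1-in-100 figure is an assumption, not a
measurement; the point is only that the observed number cannot tell the
two cases apart.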

More broadly, the per-nation rates are interesting but probably
unimportant: this is a global problem, so even if country X solved
it (for a useful value of "solved") it would matter little.  I think
at this point any estimate of bot population under 200M should be
laughed out of the room, and that (just as it has for a decade)
it continues to monotonically increase.

---rsk