Notice: Fraudulent RIPE ASNs
rsk at gsp.org
Thu Jan 17 10:33:34 UTC 2013
On Wed, Jan 16, 2013 at 11:39:14AM -0500, William Herrin wrote:
> 1. Has SPAMHAUS attempted to feed relevant portions of their knowledge
> into ARIN's reporting system for fraudulent registrations and,
I don't know the answer to that.
> 2. Understanding that ARIN can only deal with fraudulent
> registrations, not any other kind of bad-actor behavior, are there
> improvements to ARIN's process which would help SPAMHAUS and similar
> organizations feed ARIN actionable knowledge?
All ARIN (public) data should be immediately downloadable in bulk by anyone
who wishes to access it. No registration, no limits, no nothing. As I
pointed out here a couple of weeks ago (see below), query rate-limiting
measures such as RIPE currently employs are not only pointless but
counterproductive: the bad guys already have (or can have) the data any
time they wish, but the good guys can't. I suggest a daily rsync'able
snapshot of the whole enchilada in whatever form(s) is/are appropriate:
text, XML, tarball, etc.
Of course I was responding to something from RIPE, but this applies
everywhere. It's 2013. The bad guys have had the means to easily
bypass stuff like this for about a decade, if not longer. It's not only
silly to keep pretending they don't, but it's limiting: some of the best
techniques we have for spotting not only fraudulent registrations, but
other patterns of abuse, work best when given as much data as possible.
(It's really quite impressive what you can find with "grep", if you
have enough data in the right form.)
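To make the "grep" point concrete, here is a minimal sketch of the kind of pattern-spotting that bulk data allows: group registrations by a shared contact attribute and list the values that turn up on more than one object. It assumes an RPSL-style text dump (blank-line-separated objects made of "key: value" lines); the function names, the sample records, and the choice of "e-mail" as the attribute are all illustrative assumptions, not any registry's actual export format.

```python
import re
from collections import defaultdict

# Hypothetical sample data: documentation ASNs and example domains only.
SAMPLE = """\
aut-num: AS64496
org: ORG-EX1
e-mail: admin@example.net

aut-num: AS64497
e-mail: admin@example.net

aut-num: AS64498
e-mail: noc@example.org
"""

def parse_objects(text):
    """Split a dump into objects; each object is a list of (key, value) pairs."""
    objects = []
    for block in re.split(r"\n\s*\n", text.strip()):
        attrs = []
        for line in block.splitlines():
            # Skip comment lines and anything that is not "key: value".
            if line.startswith(("#", "%")) or ":" not in line:
                continue
            key, _, value = line.partition(":")
            attrs.append((key.strip().lower(), value.strip()))
        if attrs:
            objects.append(attrs)
    return objects

def group_by_attribute(objects, attribute, min_count=2):
    """Return {value: [object labels]} for values shared by >= min_count objects."""
    groups = defaultdict(list)
    for attrs in objects:
        label = attrs[0][1]  # value of the object's first attribute, e.g. the ASN
        for key, value in attrs:
            if key == attribute and label not in groups[value.lower()]:
                groups[value.lower()].append(label)
    return {v: labels for v, labels in groups.items() if len(labels) >= min_count}

if __name__ == "__main__":
    print(group_by_attribute(parse_objects(SAMPLE), "e-mail"))
```

A shared address proves nothing by itself, of course; the point is only that this sort of cross-referencing is trivial with the whole dataset in hand and impossible at 1,000 queries a day.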
(Incidentally, the same thing is true of all domain registration data.
The namespace, like network space, is a public resource, therefore
anyone using any of it must be publicly accountable.)
Here's what I said at the time; generalize/modify as appropriate:
> Subject: Re: RIPE Database Proxy Service Issues
> On Wed, Jan 02, 2013 at 05:00:14PM +0100, Axel Pawlik wrote:
> > To prevent the automatic harvesting of personal information (real
> > names, email addresses, phone numbers) from the RIPE Database, there
> > are PERSON and ROLE object query limits defined in the RIPE Database
> > Acceptable Use Policy. This is set at 1,000 PERSON or ROLE objects
> > per IP address per day. Queries that result in more than 1,000
> > objects with personal data being returned result in that IP address
> > being blocked from carrying out queries for that day.
> 1. The technical measures you've outlined will not prevent, and have
> not prevented, anyone from automatically harvesting the entire thing.
> Anyone who owns or rents, for example, a 2M-member botnet, could easily
> retrieve the entire database using 1 query per IP address, spread out
> over a day/week/month/whatever. (Obviously more sophisticated approaches
> immediately suggest themselves.)
> Of course a simpler approach might be to buy a copy from someone who
> already has.
> I'm not picking on you, particularly: all WHOIS operators need to stop
> pretending that they can protect their public databases via rate-limiting.
> They can't. The only thing that they're doing is preventing NON-abusers
> from acquiring and using bulk data.
> 2. This presumes that the database is actually a target for abusers.
> I'm sure for some it is. But as a source, for example, of email
> addresses, it's a poor one: the number of addresses per thousand records
> is relatively small and those addresses tend to belong to people with
> clue, making them rather suboptimal choices for spamming/phishing/etc.
> Far richer targets are available on a daily basis simply by following
> the dataloss mailing list et al. and observing what's been posted on
> pastebin or equivalent. These not only include many more email addresses,
> but often names, passwords (encrypted or not), and other personal details.
> And once again, the simpler approach of purchasing data is available.
> 3. Of course answering all those queries no doubt imposes significant
> load. Happily, one of the problems that we seem to have pretty much
> figured out how to solve is "serving up many copies of static
> content" because we have tools like web servers and rsync.
> So let me suggest that one way to make this much easier on yourselves is
> to export a (timestamped) static snapshot of the entire database once
> a day, and let the rest of the Internet mirror the hell out of it.
> Spreads out the load, drops the pretense that rate-limiting
> accomplishes anything useful, makes all the data available to everyone
> equally, and as long as everyone is aware that it's a snapshot and not
> a real-time answer, would probably suffice for most uses. (It would
> also come in handy during network events which render your service
> unreachable/unusable in whole or part, e.g., from certain parts of
> the world. Slightly-stale data is way better than no data.)
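As a rough sketch of the export side of that suggestion: write the day's dump to a compressed file whose name and first line carry a UTC timestamp, plus a SHA-256 checksum file so mirrors can verify what they fetched. The file naming, the header line, and the `write_snapshot` function are assumptions made for illustration; how the dump text is produced from the registry database is out of scope here.

```python
import gzip
import hashlib
from datetime import datetime, timezone
from pathlib import Path

def write_snapshot(dump_text, out_dir, now=None):
    """Write a timestamped, gzip-compressed snapshot and a matching .sha256 file.

    `now` should be a timezone-aware datetime; it defaults to the current UTC time.
    Returns the paths of the snapshot and the checksum file.
    """
    now = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")

    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    snapshot = out_dir / f"whois-snapshot-{stamp}.txt.gz"

    # The header tells every consumer that this is a snapshot, not a live answer.
    header = f"# snapshot generated {stamp}; not real-time data\n"
    with gzip.open(snapshot, "wt", encoding="utf-8") as fh:
        fh.write(header)
        fh.write(dump_text)

    # Checksum line in the conventional "<digest>  <filename>" layout.
    digest = hashlib.sha256(snapshot.read_bytes()).hexdigest()
    checksum = snapshot.with_name(snapshot.name + ".sha256")
    checksum.write_text(f"{digest}  {snapshot.name}\n", encoding="utf-8")
    return snapshot, checksum
```

The output directory is then just static content: point a web server or an rsync daemon at it and let mirrors pull from there.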