So -- what did happen to Panix?

Thu Jan 26 19:39:59 UTC 2006

Steven, all,

On Wed, Jan 25, 2006 at 03:04:30PM -0500, Steven M. Bellovin wrote:
> 
> It's now been 2.5 business days since Panix was taken out.  Do we know 
> what the root cause was?  It's hard to engineer a solution until we 
> know what the problem was.

I keep hearing that Con Ed Comm was previously an upstream of Panix
( http://www.renesys.com/blog/2006/01/coned_steals_the_net.shtml#comments )
and that this might have explained why Con Ed had Panix routes in
their radb as-27506-transit object.  But I checked our records
of routing data going back to Jan 1, 2002, and see no evidence of
27506 and 2033 being adjacent to each other in any announcement from
any of our peers at any time since then.  So I can't really verify
that Panix was ever a Con Ed Comm customer.  Can anyone else clear
this up?  So far, it's not making sense.
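
For anyone who wants to repeat this kind of check against their own
archived routing data, here is a minimal sketch of the adjacency test.
It is not the tooling used for the check described above. It assumes
AS paths are available as space-separated strings, and the function
name and the sample paths (including AS64500) are hypothetical.

```python
def adjacent_in_path(as_path, asn_a, asn_b):
    """Return True if asn_a and asn_b are neighbours in as_path.

    as_path is a space-separated string such as "64500 27506 2033".
    Prepending (the same ASN repeated) is collapsed first. Anything
    that is not a plain number (e.g. an AS_SET in braces) is treated
    as a break, so it never creates a false adjacency.
    """
    hops = []
    for token in as_path.split():
        asn = int(token) if token.isdigit() else None
        if not hops or hops[-1] != asn:
            hops.append(asn)
    # Compare each consecutive pair, in either order.
    return any({x, y} == {asn_a, asn_b} for x, y in zip(hops, hops[1:]))


# Hypothetical sample paths, for illustration only.
print(adjacent_in_path("64500 27506 2033", 27506, 2033))
print(adjacent_in_path("64500 27506 64501 2033", 27506, 2033))
```

Running this over every archived announcement and finding no match is
what "no evidence of 27506 and 2033 being adjacent" amounts to here.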

The supposition was that all of the other affected ASes that are not
currently customers of Con Ed Comm were also previously customers.
Some appear to have been (Walrus Internet (AS7169), Advanced Digital
Internet (AS23011), and NYFIX (AS20282) for sure) but I haven't been
able to verify that all of them were.  

I know that this isn't really a "root cause" that Steven was asking
for, though.  The root cause is that filtering is imperfect and
frequently out of date. This case is particularly interesting and painful
because Verio is known for building good filters automatically.  In
this case, they did so based on out-of-date information,
unfortunately. This is particularly depressing because normally in
cases of leaks like this, the propagation is via some provider or peer
who doesn't filter at all.  In this case, one of the vectors was one
of the most responsible filterers on the net.  sigh. 

So in terms of engineering good solutions, the space is pretty
crowded.  One camp is of the "total solution" variety that involves
new hardware, new protocols, and a Public Key approach where
originations (or any announcements) are signed and verified.  This is
obviously a very good and complete approach to the problem but it's
also obviously seeing precious little adoption.  And in the meantime
we have nothing.

Another set of approaches has been to look at alternate methods of
building filters, taking into account more information about history
of routing announcements and dampening or refusing to accept novel,
questionable announcements for some fixed, short amount of time.  Josh
Karlin's paper suggests that as does some of the stuff that Tom
Scholl, Jim Deleskie and I presented at the last NANOG. All of this
has the disadvantage of being a partial solution, the advantage of
being implementable easily and in stages without a network forklift or
a protocol upgrade, but the further disadvantage of being nowhere near
fully baked. 
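
To make the history-based idea concrete, here is a rough sketch of
holding back a novel origin for a fixed period when a longer-standing
origin for the same prefix is already known. It is not the algorithm
from Josh Karlin's paper or from the NANOG presentation. The class
name, the 24-hour hold period, and the example prefixes and ASNs are
all assumptions, and it only matches exact prefixes (it ignores
more-specifics and covering routes).

```python
from datetime import datetime, timedelta

HOLD = timedelta(hours=24)  # assumed hold period, purely illustrative


class OriginHistory:
    """Remembers when each (prefix, origin AS) pair was first seen."""

    def __init__(self, hold=HOLD):
        self.hold = hold
        self.first_seen = {}  # (prefix, origin_asn) -> datetime

    def observe(self, prefix, origin, when):
        # Record only the first sighting; later ones leave it unchanged.
        self.first_seen.setdefault((prefix, origin), when)

    def should_hold(self, prefix, origin, now):
        """True if this origin is novel and an established one exists."""
        self.observe(prefix, origin, now)
        if now - self.first_seen[(prefix, origin)] >= self.hold:
            return False  # this origin has been around long enough
        # Novel origin: hold it only if some other origin for the same
        # prefix has already outlived the hold period.
        return any(
            p == prefix and o != origin and now - t >= self.hold
            for (p, o), t in self.first_seen.items()
        )


history = OriginHistory()
start = datetime(2006, 1, 1)
history.observe("192.0.2.0/24", 64500, start)  # long-standing origin
later = start + timedelta(days=20)
print(history.should_hold("192.0.2.0/24", 64501, later))  # new origin
print(history.should_hold("192.0.2.0/24", 64500, later))  # known origin
```

A filter generator could run something like this off-router and simply
leave held announcements out of the next prefix-list push, which is the
"implementable in stages" property mentioned above.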

Clearly more, smarter people need to keep searching for good solutions
to this set of problems.  Extra credit for solutions that can be
implemented by individual autonomous systems without hardware upgrades
or major protocol changes, but that may not be possible.

t.

p.s.:  wrt comments made previously that imply that moving parts of
routing control off of the routers is "Bell-like" or "bell-headed":
although the comments are silly and made somewhat in jest, they're
obviously not true.  anyone who builds prefix filters or access lists
off of routers is already generating policy somewhere other than the
router.  using additional history or smarts to do that and uploading
prefix filters more often doesn't change that existing architecture or
make the network somehow "bell-like".  it might not work well enough
to solve the problem, but that's another, interesting objection.

-- 
_____________________________________________________________________
todd underwood
chief of operations & security 
renesys - internet intelligence
todd at renesys.com   http://www.renesys.com/blog