Time to revise RFC 1771

Clayton Fiske clay at bloomcounty.org
Tue Jun 26 20:47:37 UTC 2001


On Tue, Jun 26, 2001 at 04:27:49PM -0400, Dave Israel wrote:
> 
> This ignores three basic facts:
> 
> 1) Networks tend to be homogeneous in platform.
> 2) Platforms tend to accept their own implementation quirks.
> 3) Networks peer at borders.
> 
> Therefore, under the "drop the session rule," my bad announcement
> gets to all my borders fine, and all my external peers who are not
> running forgiving/compatible implementations drop their connections
> to me and all my traffic to/from them hits the floor.

In this case, vendor C's implementation was neither forgiving nor
compatible. It still dropped the peer(s) in question. It just had
the much more harmful quirk that it forwarded the bad route on to
its peers before doing so. In this case, a homogeneous network would
not only lose its border sessions, it would lose all internal ones
through which the route was advertised.

> One CRC error does not make PPP drop.  Why make one route cause
> a catastrophic loss of connectivity?  Report the bad route,
> drop it, and move on; let layer 8 resolve it.

Because, arguably, we don't know that it's just one route. We just
know that one route set off the alarm. Do you feel safe assuming that
whatever bug caused one corrupted route left all the other routes
alone?

Plus, a CRC error can occur between two valid, compliant, bug-free
implementations. A bad route, by definition, can't. We're not talking
about external faults here, but broken implementations. When one side
of a protocol session simply breaks the rules, I don't think it's
reasonable to say that the other side needs to be "fixed" to accept
that breakage. Fix the broken side.

This has got everyone's attention because of the unique way in which
the breakage occurred. If all implementations were changed
to drop the single bad route and keep the sessions intact, the damage
would not have been what it was. If all implementations followed the
current specs and dropped the session with the router which first
originated the bad route, the damage would not have been what it was.
To say that one way causes massive damage and the other doesn't is
inaccurate. The damage was caused by the implementation in question
doing something resembling the latter, but forwarding the bad route
before dropping the session.
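
To make the comparison concrete, here is a rough sketch of the three
behaviors as a toy model. It assumes routers in a simple chain, with
router 0 originating a single bad route; the function name and policy
names are made up for illustration and do not correspond to any
vendor's code or to wording in the spec.

```python
# Toy model only: routers sit in a chain, router 0 originates one bad route.
# The policy names are made up for this sketch, not taken from any vendor.

def dropped_sessions(policies):
    """Return the (sender, receiver) sessions torn down by one bad route.

    policies[i] is the behavior of router i+1, the (i+1)th hop downstream
    of the originating router 0.
    """
    dropped = []
    for receiver, policy in enumerate(policies, start=1):
        sender = receiver - 1
        if policy == "discard_route":
            # Drop the bad route, keep the session, forward nothing.
            break
        if policy == "drop_session":
            # Drop-the-session behavior as described above: tear down
            # the session with the sender and do not pass the route on.
            dropped.append((sender, receiver))
            break
        if policy == "forward_then_drop":
            # The quirk: pass the bad route on first, then tear down
            # the session with the sender.
            dropped.append((sender, receiver))
            continue
        raise ValueError(f"unknown policy: {policy}")
    return dropped

for name in ("discard_route", "drop_session", "forward_then_drop"):
    print(name, dropped_sessions([name] * 3))
```

In this model, three routers that all discard the route lose no
sessions, three that all drop the session lose only the one facing the
originator, and three that forward before dropping lose every session
along the path.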

-c



