DOs and DONTs for small ISP

Tue Jun 4 12:56:08 UTC 2019

On Mon, Jun 3, 2019 at 11:34 PM Brandon Martin <lists.nanog at monmotha.net> wrote:
>
> On 6/3/19 9:56 AM, Jon Lewis wrote:
> > 3) Don't advertise one transit provider's routes to another.  Each should
> >     be filtering your routes, but you never know.  Come up with, and use
> >     BGP communities to control route propagation.  As you grow, it sucks
> >     having to update prefix-list filters in multiple places every time
> >     something changes...like a new customer with their own IPs.
>
> To reiterate all this, FILTER EVERYTHING.
>
> To start with, explicitly specify in a route-map or similar everything
> you want to advertise.  I usually create a separate route-map for each
> transit/peer and include what I want to advertise via prefix lists (for
> my IP space) and/or communities (for downstream BGP-speaking customers
> if anticipated).

I think a related *principle* is: "Build everything as though you are
expecting to scale."

This doesn't mean "spend lots of money to buy huge
[routers|servers|commercial software|<etc>], but rather "when you plan
your addressing structure and routing policies and monitoring and
device config generation and... keep the in mind the question "If this
suddenly takes off, and I hire N more people to run this, can I
explain to them how it works? Do I have documentation I can point them
at or is it stuck in my head / on the devices? If I need to add
another M customers in the next month, can I do that easily?".

This is related to the FILTER EVERYTHING -- when you turn up a new
customer / peer / transit / whatever, you shouldn't be sitting around
trying to figure out how you will write their route-map /
policy-options -- this leads to weird one-offs, and quick hacks.
Instead you should have policies already largely designed and simply
plug in their prefixes (or, better yet, use bgpq3 or similar to build
and populate these). Obviously there will be some cases where a new
connection does require some special handling, but that *should* just
be a plugin/chain in an existing policy-statement. Related to this is
how you end up naming things -- I recently found 9 variants of
firewall-filters which basically do:

filter ACCEPT {
   term ACCEPT {
    then accept;
  }
}
named things like: ACCEPT, ACEPT, Accept, Allow, Permit_all,
AcceptAll, dontdrop [0].

Obviously, there is a tension in the "design for scale" - while it
would be great to design a complete automation system so that
everything from installing a new customer to a new sites is simply
typing 'make <thing>' and having everything pull from a database, at
some point you will need to actually build a network, or you'll never
have customers :-) Just keep in mind that "Am I building myself into a
corner here?". E.g it only takes 10 or 15 minutes to install something
like NetBox to keep track of addresses (and prefixes and racks and
connections and ...) -- stuffing this in a spreadsheet might save you
a few minutes *now*, but will this scale? Can $new_person easily
figure it out?

W
[0]: My personal favorite is:
filter Accept_All {
    term Accept {
        then {
            count dropped;
            reject;
        }
    }
    term filter_<customer> {
        from {
            prefix-list {
                <customer>;
           }
        }
        then accept;
    }
    term NEXT {
        then log;
    }
}

Presumably this all made sense to <name_removed_to_protect_inoccent>
when they stuck it in at 3AM to deal with some crazy issue, but...

>
> When you turn on the session, check what you're squawking AND WHAT
> YOU'RE FILTERING.  You shouldn't be filtering anything you don't expect.
>   Belt + suspenders.
>
> The same goes for anything you accept.  Obviously for a blended full
> transit BGP edge router, you're probably going to accept almost
> everything.  But if you only want default + on-net, try to filter using
> communities from the peer, etc.  Again, right when you turn on the
> session, "sh ip bgp ... filtered" of whatever's equivalent on your
> platform.  If you're filtering something you don't expect to be
> receiving at all, figure out where the misunderstanding or
> misconfiguration lies.
>
> And of course it goes without saying that, if you've got BGP speaking
> customers, you filter the heck out of them.  Use ROAs and/or RPKI if you
> can to automatically generate filter lists.  Encourage your upstreams to
> do the same if they're filtering you (and they probably are, or at least
> should be, if you're new).  Remember that you are responsible for every
> route you advertise, at the end of the day, even if you only advertised
> it because a downstream network made a boo-boo and you didn't filter it.
>
> Filters are useful on your IGP, too, but there's so many ways to set all
> that up that it's a bit more difficult to come up with nearly universal
> best practices.  Generally speaking, be careful with redistribution,
> never distribute BGP into IGP or vice versa unless you have a really,
> really good reason to, and consider filters between both IGP
> areas/regions or protocols (e.g. RIP coming into OSPF) as well as on
> redistributions of static/connected to prevent simple typos on a static
> route or interface configuration from taking down more than just local
> stuff.
>
> It's way, way easier to remove or relax filters later if they prove more
> of an operational hazard than asset than it is to add or tighten them if
> they prove insufficient.
> --
> Brandon Martin

-- 
I don't think the execution is relevant when it was obviously a bad
idea in the first place.
This is like putting rabid weasels in your pants, and later expressing
regret at having chosen those particular rabid weasels and that pair
of pants.
   ---maf