BGP unnumbered examples from data center network using RFC 5549 et al. [was: Re: RFC 5549 - IPv4 Routes with IPv6 next-hop - Does it really exists?]

Mark Tinka mark.tinka at seacom.com
Thu Jul 30 12:56:57 UTC 2020



On 30/Jul/20 12:00, Simon Leinen wrote:

> As Nick mentions, the hostnames are from the BGP hostname extension.
>
> I should have noticed that, but we use "BGP unnumbered"[1][2], which
> uses RAs to discover the peer's IPv6 link-local address, and then builds
> an IPv6 BGP session (that uses RFC 5549 to transfer IPv4 NLRIs as well).
>
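
For readers wondering what "uses RFC 5549" means on the wire: both
speakers advertise the Extended Next Hop Encoding capability in their
OPEN messages.  Below is a minimal Python sketch of how that capability
is encoded.  The constants are assumed from the IANA registries
(capability code 5, AFI 1 = IPv4, AFI 2 = IPv6, SAFI 1 = unicast), and
the function names are made up for illustration; none of it is taken
from the routers in this thread.

```python
import struct

# Assumed constants from the IANA registries, not from the thread.
CAP_EXTENDED_NEXT_HOP = 5    # capability code: Extended Next Hop Encoding
AFI_IPV4, AFI_IPV6 = 1, 2    # address family identifiers
SAFI_UNICAST = 1             # subsequent address family identifier


def encode_extended_nexthop(entries):
    """Encode (nlri_afi, nlri_safi, nexthop_afi) tuples as one capability TLV."""
    value = b"".join(
        struct.pack("!HHH", afi, safi, nh_afi) for afi, safi, nh_afi in entries
    )
    # One octet of capability code, one octet of length, then the value.
    return struct.pack("!BB", CAP_EXTENDED_NEXT_HOP, len(value)) + value


def decode_extended_nexthop(tlv):
    """Inverse of encode_extended_nexthop; rejects malformed input."""
    code, length = struct.unpack("!BB", tlv[:2])
    if code != CAP_EXTENDED_NEXT_HOP or length % 6 or length != len(tlv) - 2:
        raise ValueError("not a well-formed Extended Next Hop capability")
    return [struct.unpack("!HHH", tlv[i:i + 6]) for i in range(2, len(tlv), 6)]


# "IPv4 unicast NLRI may be carried with an IPv6 next hop."
tlv = encode_extended_nexthop([(AFI_IPV4, SAFI_UNICAST, AFI_IPV6)])
print(tlv.hex())  # prints 0506000100010002
```

The single tuple is presumably what "capability extended-nexthop" in
the configuration below asks the peer to agree to; the session itself
runs over the link-local IPv6 addresses learned from the RAs.
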
> Here are some excerpts of the configuration on such a leaf router.
>
> General BGP boilerplate:
>
> ------------------------------
> router bgp 65111
>  bgp router-id 10.1.1.46
>  bgp bestpath as-path multipath-relax
>  bgp bestpath compare-routerid
>  !
>  address-family ipv4 unicast
>   network 10.1.1.46/32
>   redistribute connected
>   redistribute static
>  exit-address-family
>  !
>  address-family ipv6 unicast
>   network 2001:db8:1234:101::46/128
>   redistribute connected
>   redistribute static
>  exit-address-family
> ------------------------------
>
> Leaf switch <-> server connection: (we use an 802.1Q tagged subinterface
> for the BGP peering and L3 server traffic; the untagged interface is
> used only for netbooting the servers when (re)installing the OS.  Here,
> servers just get IPv4+IPv6 default routes, and each server will only
> announce a single IPv4+IPv6 (loopback) address, i.e. the leaf/server
> links are also "unnumbered".  Very simple redundant setup without any
> LACP/MLAG protocols... it's all just BGP+IPv6 ND.  You can basically
> connect any server to any switch port and things will "just work"
> without special inter-switch links etc.)
>
> ------------------------------
> interface swp1s0
>  description s0001.s1.scloud.switch.ch p8p1
> !
> interface swp1s0.3
>  description s0001.s1.scloud.switch.ch p8p1
>  ipv6 nd ra-interval 3
>  no ipv6 nd suppress-ra
> !
> [...]
> router bgp 65111
>  neighbor servers peer-group
>  neighbor servers remote-as external
>  neighbor servers capability extended-nexthop
>  neighbor swp1s0.3 interface peer-group servers
>  !
>  address-family ipv4 unicast
>   neighbor servers default-originate
>   neighbor servers soft-reconfiguration inbound
>   neighbor servers prefix-list DEFAULTV4-PERMIT out
>  exit-address-family
>  !
>  address-family ipv6 unicast
>   neighbor servers activate
>   neighbor servers default-originate
>   neighbor servers soft-reconfiguration inbound
>   neighbor servers prefix-list DEFAULTV6-PERMIT out
>  exit-address-family
> !
> ip prefix-list DEFAULTV4-PERMIT permit 0.0.0.0/0
> !
> ipv6 prefix-list DEFAULTV6-PERMIT permit ::/0
> ------------------------------
>
> Leaf <-> spine:
>
> ------------------------------
> interface swp16
>  description sw-o port 22
>  ipv6 nd ra-interval 3
>  no ipv6 nd suppress-ra
> !
> [...]
> router bgp 65111
>  neighbor fabric peer-group
>  neighbor fabric remote-as external
>  neighbor fabric capability extended-nexthop
>  neighbor swp16 interface peer-group fabric
>  !
>  address-family ipv4 unicast
>   neighbor fabric soft-reconfiguration inbound
>  !
>  address-family ipv6 unicast
>   neighbor fabric activate
>   neighbor fabric soft-reconfiguration inbound
> ------------------------------
>
> Note the "remote-as external" - this will accept any AS other than the
> router's own AS.  AS numbering in this DC setup is a bit weird if you're
> used to BGP... each leaf switch has its own AS, all spine switches
> should have the same AS number (for reasons...), and all servers have
> the same AS because who cares.  (We are talking about three disjoint
> sets of AS numbers for leaves/spines/servers though.)
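
The AS-numbering rules in that last paragraph can be sketched in a few
lines of Python.  This is only an illustration: the function names are
hypothetical, and every AS number except 65111 is invented rather than
taken from the setup described above.

```python
def accepts_remote_as_external(local_as, peer_as):
    """Model of 'remote-as external': accept any peer AS that differs from ours."""
    return peer_as != local_as


# Hypothetical AS plan; only 65111 appears in the thread, the rest is made up.
SPINE_AS = 65100                             # all spines share one AS
LEAF_AS = {"leaf1": 65111, "leaf2": 65112}   # one AS per leaf
SERVER_AS = 65200                            # all servers share one AS


def check_plan(spine_as, leaf_as, server_as):
    """Return a list of problems; an empty list means the plan fits the scheme."""
    problems = []
    leaves = list(leaf_as.values())
    if len(set(leaves)) != len(leaves):
        problems.append("leaf AS numbers must be unique")
    if spine_as in leaves or server_as in leaves or spine_as == server_as:
        problems.append("leaf, spine and server AS sets must be disjoint")
    return problems


# Every leaf should accept both its spine and its server neighbours.
for name, asn in LEAF_AS.items():
    ok = all(accepts_remote_as_external(asn, peer) for peer in (SPINE_AS, SERVER_AS))
    print(name, "accepts spine and server peers:", ok)

print("plan problems:", check_plan(SPINE_AS, LEAF_AS, SERVER_AS))
```

With disjoint sets, every leaf-facing session is eBGP, so a single
"remote-as external" peer-group covers all of them without listing
individual neighbor AS numbers.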

Interesting.

Data centre bits are interesting :-).

Thanks for sharing.

Mark.


