Quakecon: Network Operations Center tour

Harald F. Karlsen elfkin at gmail.com
Mon Aug 3 08:35:50 UTC 2015


On 02.08.2015 23:36, Josh Hoppes wrote:
> We haven't tackled IPv6 yet since it adds complexity that our primary
> focus doesn't significantly benefit from yet since most games just
> don't support it. Our current table switches don't have an RA guard,
> and will probably require replacement to get ones that are capable.

The lack of RA guard/DHCPv6 guard can still bite you. A client can send
rogue RAs, advertise a rogue DNS server and start hijacking traffic,
since most operating systems these days prefer AAAA over A records.
IPv6 first-hop security is really underrated, and not providing the
clients with IPv6 does not rule out IPv6 as a potential attack vector.
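
If replacing the table switches isn't feasible right away, even a simple
sniffer on a mirror port gives you some visibility into rogue RAs. A rough
Python/Scapy sketch of that idea; the interface name and the known-router
MAC are placeholders, not anything from a real setup:

# Rogue-RA monitor sketch (assumes Scapy is installed).
# IFACE and KNOWN_ROUTERS are placeholders for your own environment.
from scapy.all import sniff
from scapy.layers.inet6 import ICMPv6ND_RA
from scapy.layers.l2 import Ether

IFACE = "eth0"                          # monitoring interface (assumption)
KNOWN_ROUTERS = {"00:11:22:33:44:55"}   # MACs of legitimate routers (assumption)

def check_ra(pkt):
    # Alert on any Router Advertisement not sourced from a known router.
    if pkt.haslayer(ICMPv6ND_RA):
        src_mac = pkt[Ether].src
        if src_mac not in KNOWN_ROUTERS:
            print("Rogue RA from %s: %s" % (src_mac, pkt.summary()))

# The "icmp6" BPF filter keeps the capture volume manageable.
sniff(iface=IFACE, filter="icmp6", prn=check_ra, store=False)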

> We also re-designed the LAN back in 2011 to break up the giant single
> broadcast domain down to a subnet per table switch. This has
> definitely gotten us some flack from the BYOC since it breaks their
> LAN browsers, but we thought a stable network was more important with
> how much games have become dependent on stable Internet connectivity.
> Still trying to find a good way to provide a middle ground for
> attendees on that one, but I'm sure everyone here would understand how
> insane a single broadcast domain with 2000+ hosts that aren't under
> your control is. We have tried to focus on latency on the LAN, however
> when so many games are no longer LAN oriented Internet connectivity
> became a dominant issue.

At The Gathering we solved this by using ip helper-address for specific
game ports together with a broadcast forwarder daemon (which has been
made publicly available). It sounds ugly, but it works pretty well; just
make sure to rate-limit the forwarded broadcasts, as things can get out
of hand quickly in the case of a loop or broadcast storm.
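
The daemon itself is conceptually simple: pick up a game's discovery
broadcast on one subnet and replay it onto the others, with a rate limit
in front. A minimal Python sketch of that idea (the port and broadcast
addresses are made-up examples, not our actual config):

# Stripped-down broadcast forwarder sketch (not the daemon we actually run).
# GAME_PORT and SUBNET_BROADCASTS are example values only.
import socket
import time

GAME_PORT = 27015                                 # example discovery port
SUBNET_BROADCASTS = ["10.0.1.255", "10.0.2.255"]  # example per-table subnets
RATE_LIMIT = 200                                  # forwarded packets per second

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
sock.bind(("", GAME_PORT))

tokens, last = RATE_LIMIT, time.monotonic()

while True:
    data, addr = sock.recvfrom(2048)
    # Simple token bucket so a loop or broadcast storm can't melt the relay.
    now = time.monotonic()
    tokens = min(RATE_LIMIT, tokens + (now - last) * RATE_LIMIT)
    last = now
    if tokens < 1:
        continue
    tokens -= 1
    # Replay the discovery packet onto the other subnets. A real daemon also
    # has to make sure it doesn't re-forward its own packets (check addr
    # against the relay's own addresses).
    for bcast in SUBNET_BROADCASTS:
        sock.sendto(data, (bcast, GAME_PORT))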

> Some traffic is routed out a separate lower capacity connection to
> keep saturation issues from impacting it during the event.
>
> Squid and nginx do help with caching, and thankfully Steam migrated to
> a http distribution method and allows for easy caching. Some other
> services make it more difficult, but we try our best. Before Steam
> changed to http distribution there were a few years they helped in
> providing a local mirror but that seems to have been discontinued with
> the migration to http. The cache pushed a little over 4Gbps of traffic
> at peak at the event.
>
> The core IT team which handles the network (L2 and above) is about 9
> volunteers. The physical infrastructure is our IP & D team, which gets
> a huge team of volunteers put together in order to get that 13 miles
> of cable ready between Monday and Wednesday. The event is very
> volunteer driven, like many LAN parties across the planet. We try to
> reuse cable from year to year, including loading up the table runs
> onto a pallet to be used in making new cables out of in future years.
>

Thanks for the write-up, it's always cool to read how others in the 
"LAN-party scene" do things! :)

-- 
Harald
