DNS problems to RoadRunner - tcp vs udp

Scott C. McGrath mcgrath at fas.harvard.edu
Mon Jun 16 16:51:54 UTC 2008


Thanks for the helpful suggestions.

For what it's worth, we use Cisco's CNR because we operate a MAC 
registration system which controls access to our network.  We allow 
customers to select hostnames, which are pushed into DDNS when the 
system acquires a lease.  CNR has internal (user-configurable) limits 
which control the TCP state machine, and these are easy to overwhelm: 
once you hit the high limit, the server process stops accepting any 
new connection requests until the number of connections drops below 
the max limit once again.  We have been in constant contact with the 
development group about defending these machines from DDoS activity.

Because of our network structure, UDP is somewhat easier to rate limit 
than TCP, and we do operate microflow policers to limit UDP activity 
from any given host.
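
To illustrate the idea behind a per-host microflow policer, here is a 
minimal Python sketch of a token-bucket limiter keyed by source 
address.  It is not the actual policer configuration, which lives in 
the network gear; the class name, the rate and the burst values are 
all assumptions chosen for illustration.

```python
import time


class PerHostPolicer:
    """Token-bucket policer keyed by source address (illustrative only)."""

    def __init__(self, rate_pps=20.0, burst=40.0, clock=time.monotonic):
        self.rate_pps = rate_pps  # assumed sustained packets/sec per host
        self.burst = burst        # assumed bucket depth (largest burst allowed)
        self.clock = clock        # injectable clock, handy for testing
        self.buckets = {}         # src_ip -> (tokens, time of last update)

    def allow(self, src_ip):
        """Return True if a packet from src_ip conforms, False to drop it."""
        now = self.clock()
        # A host seen for the first time starts with a full bucket.
        tokens, last = self.buckets.get(src_ip, (self.burst, now))
        # Refill in proportion to elapsed time, capped at the bucket depth.
        tokens = min(self.burst, tokens + (now - last) * self.rate_pps)
        if tokens >= 1.0:
            self.buckets[src_ip] = (tokens - 1.0, now)
            return True
        self.buckets[src_ip] = (tokens, now)
        return False
```

Each source gets its own bucket, so one noisy host exhausts only its 
own allowance and does not affect anyone else.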

We once used BIND, but BIND could not handle the DDNS updates in a 
reasonable fashion, as we have many short-lived connections when 
students access the wireless network between classes.  Hence the move 
to CNR, which handles DDNS effectively but does not like TCP-based 
attacks.

Unlike MIT over the river, Harvard only has 2 Class B's available, and 
we have many more registered clients than we have IP space for, along 
with a community which requires fixed hostnames for academic reasons.  
Since we cannot make static IP assignments except to well-known and 
fixed services, this becomes problematic.  Hence DDNS, which, as many 
have pointed out here, is painful from an operational standpoint, but 
in our environment it is a lifesaver.

Unfortunately, we have needed to insert some controlled breakage into 
the network to keep the services our customers require alive, as TCP 
SYN attacks are still effective in this day and age.  We have tried 
many things.  Our latest foray into TCP control is creating a Snort 
infrastructure sufficient to monitor all flows ingressing and 
egressing our network and, based on analysis of that data, applying 
rules to limit traffic in real time from ill-behaved TCP hosts.  Our 
long-term goal is not to operate a corporate network locked into 
stupid mode with no understanding of protocol needs.
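
As a rough illustration of the "analyze, then limit" step, the Python 
sketch below counts SYNs per source over a sliding time window and 
flags sources that exceed a threshold.  It does not use any Snort API; 
it assumes some sensor already supplies (source address, timestamp) 
pairs, and the class name, window length and threshold are 
hypothetical.

```python
from collections import defaultdict, deque


class SynRateMonitor:
    """Flag sources that send too many SYNs within a sliding window."""

    def __init__(self, window_s=10.0, max_syns=100):
        self.window_s = window_s      # assumed window length in seconds
        self.max_syns = max_syns      # assumed SYNs tolerated per window
        self.seen = defaultdict(deque)  # src_ip -> timestamps of recent SYNs

    def observe(self, src_ip, timestamp):
        """Record one SYN; return True if src_ip now exceeds the threshold."""
        q = self.seen[src_ip]
        q.append(timestamp)
        # Discard observations that have aged out of the window.
        while q and timestamp - q[0] > self.window_s:
            q.popleft()
        return len(q) > self.max_syns
```

A source for which observe() returns True is a candidate for a 
rate-limit rule, rather than a blanket drop of all TCP.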

- Scott

Nathan Ward wrote:
> On 15/06/2008, at 9:18 AM, Scott McGrath wrote:
>> Yes - we are blocking TCP.  Too many problems with drone armies.  We 
>> started about a year ago, when our DNS servers became unresponsive 
>> for no apparent reason.  Investigation showed TCP flows of hundreds 
>> of megabits/sec and connection table overflows from tens of 
>> thousands of bots all trying to simultaneously do zone transfers and 
>> failing.  We tried active denial systems and shunning, with limited 
>> effectiveness.
>> We are well aware of the host-based mechanisms to control zone 
>> information.  Trouble is, with TCP, if you can open the connection 
>> you can DoS, so we don't allow the connection to be opened, and this 
>> is enforced at the network level where we can drop at wire speed.  
>> Open to better ideas, but if you look at the domain in my email 
>> address you will see we are a target for hostile activity just so 
>> someone can 'make their bones'.
> There are really two problems here. One is packet/bit rate causing 
> problems for your network; that's not necessarily an end system thing. 
> It's not really DNS specific, and blocking 53/TCP doesn't really help 
> here, as people could just send 53/UDP your way and get the same effect.
> Connection table overflowing is a bit of a different issue. The obvious 
> way to overcome that is to whack a load balancer in there to share the 
> load around. It's not immediately obvious to me why your connection 
> table would be filling up - what state were connections stuck in?
> Anyway, one thought that comes to me would be to split off UDP and TCP 
> services to different servers - if some TCP attack kills your TCP DNS 
> server you:
> a) don't have to worry about UDP services failing.
> b) can turn it off for the duration of the attack, and are no worse 
> off than you are right now, then turn it back on when you see the high 
> volume of SYN messages disappear.
> c) as TCP DNS service recovery isn't super time critical (I'm assuming 
> this, because you're not running it at all right now) you have time to 
> look at the anatomy of the attack and figure out how to filter it more 
> precisely if possible, instead of simply dropping all TCP.
> Obviously, you'd want to make sure TCP from your other name servers 
> always goes to the UDP one, etc. etc.
> -- 
> Nathan Ward
