TWC (AS11351) blocking all NTP?

Christopher Morrow morrowc.lists at gmail.com
Tue Feb 4 02:33:51 UTC 2014


-larry directly since I'm sure he's either tired of this, or already
reading it via the nanog subscription.

On Mon, Feb 3, 2014 at 7:54 PM, Peter Phaal <peter.phaal at gmail.com> wrote:
> On Mon, Feb 3, 2014 at 2:58 PM, Christopher Morrow
> <morrowc.lists at gmail.com> wrote:
>> wait, so the whole of the thread is about stopping participants in the
>> attack, and you're suggesting that removing/changing end-system
>> switch/routing gear and doing something more complex than:
>>   deny udp any 123 any
>>   deny udp any 123 any 123
>>   permit ip any any
>>
>> is a good plan?
>>
>> I'd direct you at:
>>   <https://www.nanog.org/resources/tutorials>
>>
>> and particularly at:
>>  "Tutorial: ISP Security - Real World Techniques II"
>>  <https://www.nanog.org/meetings/nanog23/presentations/greene.pdf>
>
> Thanks for the links. Many SDN solutions can be replicated using

you're sort of a broken record on this bit ... I don't think folk (me
in particular) are knocking sdn things, in general. In the specific
though:
  1) you missed the point originally, stop marketing your blog pls.
  2) you missed the point(s) about availability and realistic
deployment of solutions in the near term

> manual processes (or are ways of automating currently manual
> processes). Programmatic APIs allow the speed and accuracy of the
> response to be increased and the solution to be delivered at scale and
> at lower cost.

and all of these require very strict and very careful deployment of
oss measures to watch over current state and intended state. They
require also very careful training and troubleshooting steps for the
ops folk running the systems.  None of this is deployable 'tomorrow'
(in under 24hrs) safely, and most likely it'll be a bit more time
until there is ubiquitous deployment of sdn-like functionality in
larger scale networks.

not that I'm not a fan, and not that I don't like me some automation,
but.. I've seen automation go very wrong (l3's acl spider... crushes
l3..., flowspec 'whoopsie' at cloudflare and TWTC... there are lots of
other examples).
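
a minimal sketch of the kind of "current state vs intended state" check
meant above, in python. the function name, the max_changes guard and the
example prefixes are all made up for illustration, and it assumes the
rules are order-independent (null routes, say); an ordered acl would
need a sequence-aware diff instead.

```python
def plan_changes(current, intended, max_changes=5):
    """Diff intended state against current state before touching a device.

    Hypothetical helper: rules are treated as an unordered set, and the
    push is refused outright when the diff is larger than max_changes,
    on the theory that a huge diff is more likely a bug than an attack.
    """
    current_set, intended_set = set(current), set(intended)
    to_add = sorted(intended_set - current_set)
    to_remove = sorted(current_set - intended_set)
    if len(to_add) + len(to_remove) > max_changes:
        raise RuntimeError(
            f"refusing to push {len(to_add)} adds and {len(to_remove)} removes "
            f"(limit is {max_changes}); needs a human to look at it"
        )
    return to_add, to_remove


if __name__ == "__main__":
    adds, removes = plan_changes(
        current=["192.0.2.1/32"],
        intended=["192.0.2.1/32", "198.51.100.7/32"],
    )
    print("add:", adds, "remove:", removes)
```

the guard is the point, not the diff: the failure modes listed above
were mostly automation doing a lot of the wrong thing very quickly.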

>> it's probably not a good plan to forklift your edge, for dos targets
>> where all you really need is a 3 line acl.
>
> For many networks it doesn't need to be forklift upgrade - vendors are
> adding programmatic APIs to their existing products (OpenFlow, Arista
> eAPI, NETCONF, ALU Web Services ...) - so a firmware upgrade may be

arista is deployed in which large scale networks with api/sdn
functionality ? they're a great bunch of folks, they make some nice
gear, it's still getting baked though, and it's not displacing (today)
existing gear that's still being depreciated. for anything to be
workable in the near-term, the above examples just aren't going to
work. note my many references to "5-7 yrs when depreciation cycles and
next-replacement happens"

> all that is required.
>
> I do think that there are operational advantages to using protocols
> like OpenFlow, I2RS, BGP FlowSpec for these soft controls since they
> allow the configuration to remain relatively static and they avoid
> problems of split control (for example, an operator makes a config
> change and saves, locking in a temporary control from the SDN system).

automation needs protections, safety checks, and assurances that the
process won't break things in odd failure modes.. not to mention
bug^H^H^Hfeature issues with gear. we're still a bit away from large
scale deployment.

> I would argue that the more specific the ACL can be the less
> collateral damage. Built-in measurement allows for a more targeted
> response.

sure, I think roland and I at least have been saying the same thing.

>>> Good point - the proposed solution is most effective for protecting
>>> customers that are targeted by DDoS attacks. While trying to prevent
>>
>> Oh, so the 3 line acl is not an option? or (for a lot of customers a
>> fine answer) null route? Some things have changed in the world of dos
>> mitigation, but a bunch of the basics still apply. I do know that in
>> the unfortunate event that your network is the transit or terminus of
>> a dos attack at high volume you want to do the least configuration
>> that'll satisfy the 2 parties involved (you and your customer)...
>> doing a bunch of hardware replacement and/or sdn things when you can
>> get the job done with some acls or routing changes is really going to
>> be risky.
>
> I think an automatic system using a programmatic API to install as
> narrowly scoped a filter as possible is the most conservative and
> least risky option. Manual processes are error prone, slow, and blunt
> instruments like a null route can cause collateral damage to services.

folk say this, but the customer very often explicitly asks for null
routes. The thing being targeted is very often not 'revenue
generating ecommerce site', and for providers where the default answer
is 'everything is a null route', their customers ought to find a
provider that thinks differently.
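
to make the collateral-damage tradeoff concrete, here's a rough python
sketch comparing the 3 line acl from earlier in the thread with a null
route for the victim. assumptions: "deny udp any 123 any" is read as
"udp with source port 123", the victim is 192.0.2.10, and the three
sample packets are invented. it's a toy model of the two policies, not
how any router evaluates them.

```python
from ipaddress import ip_address, ip_network
from typing import NamedTuple


class Packet(NamedTuple):
    proto: str
    src_port: int
    dst: str
    dst_port: int


def acl_permits(pkt: Packet) -> bool:
    # deny udp any 123 any  (read here as: udp with source port 123;
    # this also covers the narrower "any 123 any 123" line)
    if pkt.proto == "udp" and pkt.src_port == 123:
        return False
    # permit ip any any
    return True


def null_route_permits(pkt: Packet, victim: str = "192.0.2.10/32") -> bool:
    # a null route drops everything destined to the victim prefix
    return ip_address(pkt.dst) not in ip_network(victim)


# invented sample traffic, documentation address space only
TRAFFIC = {
    "ntp reflection at victim": Packet("udp", 123, "192.0.2.10", 80),
    "web request to victim": Packet("tcp", 51000, "192.0.2.10", 443),
    "ntp reply to other host": Packet("udp", 123, "198.51.100.5", 123),
}


def compare(traffic=TRAFFIC):
    """Return {name: (acl_permits, null_route_permits)} for each packet."""
    return {
        name: (acl_permits(p), null_route_permits(p))
        for name, p in traffic.items()
    }


if __name__ == "__main__":
    for name, (acl_ok, null_ok) in compare().items():
        print(f"{name:28s} acl={acl_ok!s:5s} null-route={null_ok}")
```

both options drop the attack. the acl also drops legit ntp replies to
other hosts, the null route also drops the victim's web traffic. which
collateral is acceptable is the customer's call.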

>>>>> Typical networks probably only see a few DDoS attacks an hour at the
>>>>> most, so pushing a few rules an hour to mitigate them should have
>>>>> little impact on the switch control plane.
>>>>
>>>> based on what math did you get 'few per hour?' As an endpoint (focal
>>>> point) or as a contributor? The problem that started this discussion
>>>> was being a contributor...which I bet happens a lot more often than
>>>> /few an hour/.
>>>
>>> I am sorry, I should have been clearer, the SDN solution I was
>>> describing is aimed at protecting the target's links, rather than
>>> mitigating the botnet and amplification layers.
>>
>> and i'd say that today sdn is out of reach for most deployments, and
>> that the simplest answer is already available.
>>
>>> The number of attacks was from the perspective of DDoS targets and
>>> their service providers.  If you are considering each participant in
>>> the attack the number goes up considerably.
>>
>> I bet roland has some good round-numbers on number of dos attacks per
>> day... I bet it's higher than a few per hour globally, for the ones
>> that get noticed.
>
> The "few per hour" number isn't a global statistic. This is the number
> that a large hosting data center might experience. The global number

I wonder how many rackspace, softlayer, amazon-aws, xs4all, hetzner,
etc experience per hour. in any case, 'often' is probably close
enough.

> is much larger, but not very relevant to a specific provider looking
> to size a mitigation solution.
>
>> note that the focus of the original thread was on the contributors. I
>> think the target part of the problem has been solved since before the
>> slides in the pdf link at the top...
>
> Do most service providers allow their customers to control ACLs in the
> upstream routers? Do they automatically monitor traffic and insert the

nope, and I don't necessarily think that changes with SDN... letting
your customer traffic-engineer is ... dangerous. it tosses capacity
planning concerns out the window :(

There are several providers, however, that let their customers
initiate smart/intelligent mitigation solutions. I know of 3
that let the customer trigger based on BGP community. A customer can
choose how they want to 'detect' and then simply bgp-update for
mitigation... I bet there are folk that don't own networks that
provide this service as well... I'm sure roland has some work stories
he's presented about this very thing.
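
roughly, the provider-side logic for a community trigger looks like the
python sketch below. everything specific in it is an assumption for
illustration: the community values (64500:666, 64500:777), the action
names, and the function itself. real deployments do this in router
policy, and each provider documents its own communities.

```python
from ipaddress import ip_network

# Hypothetical trigger communities; 64500 is a documentation ASN.
TRIGGERS = {
    "64500:666": "blackhole",  # drop traffic to the prefix
    "64500:777": "scrub",      # divert traffic through mitigation gear
}


def mitigation_action(prefix, communities, customer_prefixes):
    """Decide what to do with a customer BGP update.

    The customer may only trigger mitigation for address space they
    are registered for; anything else is rejected regardless of tags.
    """
    net = ip_network(prefix)
    owned = any(
        net.version == ip_network(p).version and net.subnet_of(ip_network(p))
        for p in customer_prefixes
    )
    if not owned:
        return "reject"
    for community in communities:
        if community in TRIGGERS:
            return TRIGGERS[community]
    return "forward"


if __name__ == "__main__":
    print(mitigation_action("192.0.2.10/32", ["64500:666"], ["192.0.2.0/24"]))
```

the ownership check matters more than the trigger: without it a
customer can blackhole somebody else's address space.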

> filters themselves when there is an attack? I don't believe so - while

some providers do, based upon customer demand for the service. it's
not really that hard, though it is a cost for the provider so that's
shared with the customers using the solution(s).

> the slides describe a solution, automation is needed to make it
> available at large scale.

automation isn't precluded from solution space in the slides, note
that they were presented and created in ~2002... so the state of the
art has changed a bit since then, but the methodology and practices
from 2002 can be applied fairly directly today.

>> you're getting pretty complicated for the target side:
>>   ip access-list 150 permit ip any any log
>>
>> (note this is basically taken verbatim from the slides)
>>
>> view logs, see the overwhelming majority are to hostX port Y proto
>> Z... filter, done.
>> you can do that in about 5 mins time, quicker if you care to rush a bit.
>
> An automated system can perform the analysis and apply the filter in a
> second with no human intervention. What if you have to manage
> thousands of customer links?

been there, done that... got several tshirts. it's honestly not that bad.
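
for what it's worth, the "view logs, see the overwhelming majority"
step is easy to script too. a rough python sketch follows. the log line
format in the regex is an assumption modelled on IOS-style acl log
messages, and the sample lines are invented, so adjust the pattern to
whatever your gear actually emits.

```python
import re
from collections import Counter

# Assumed log shape (invented samples below follow it):
#   ... list 150 permitted udp 203.0.113.5(123) -> 192.0.2.10(80), 40 packets
LOG_RE = re.compile(
    r"list (?P<acl>\S+) permitted (?P<proto>\w+) "
    r"(?P<src>[\d.]+)\((?P<sport>\d+)\) -> "
    r"(?P<dst>[\d.]+)\((?P<dport>\d+)\), (?P<count>\d+) packets?"
)


def top_targets(lines, n=3):
    """Tally packets per (dst host, dst port, proto) and return the top n."""
    tally = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m:  # lines that don't match the assumed format are skipped
            key = (m["dst"], int(m["dport"]), m["proto"])
            tally[key] += int(m["count"])
    return tally.most_common(n)


SAMPLE = [
    "%SEC-6-IPACCESSLOGP: list 150 permitted udp 203.0.113.5(123) -> 192.0.2.10(80), 40 packets",
    "%SEC-6-IPACCESSLOGP: list 150 permitted udp 203.0.113.9(123) -> 192.0.2.10(80), 55 packets",
    "%SEC-6-IPACCESSLOGP: list 150 permitted tcp 198.51.100.2(51000) -> 192.0.2.20(443), 1 packet",
]

if __name__ == "__main__":
    for (host, port, proto), packets in top_targets(SAMPLE):
        print(f"{host} port {port} proto {proto}: {packets} packets")
```

the top entry is the hostX/portY/protoZ to filter on. a human still
gets to decide whether to apply the filter.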

>>> This brings up an interesting use case for an OpenFlow capable
>>> switch - replicating sFlow, NetFlow, IPFIX, Syslog, SNMP traps etc.
>>> Many top of rack switches can also forward the traffic through a
>>> GRE/VxLAN tunnel as well.
>>
>> yes, more complexity seems like a great plan... in the words of
>> someone else: "I encourage my competitors to do this"
>
> Using the existing switches to replicate and tap production traffic is
> less complex and more scalable than alternatives. You may find the
> following use case interesting:
>
> http://blog.sflow.com/2013/04/sdn-packet-broker.html
>
>> I think roland's other point that not very many people actually even
>> use sflow is not to be taken lightly here either.
>
> It doesn't have to be sFlow - the sFlow solution was provided as a
> concrete example since that is the technology I am most familiar with.

and which, according to a credible source, is by and large not deployed
by service providers. certainly in some IDC situations sflow is
interesting, but it's not there according to someone who I believe is
in a position to know, for isp situations.

leaving it out though, some signal of 'traffic looks like' is
available if deployed. not everyone does...some don't because 'meh!'
some don't because 'not in featureset bought' some don't because
'<other silly reason>'. folk that don't have it generally can't just
crank it up 'now' though.

> However, sFlow, IPFIX, NetFlow, jFlow etc. combined with analytics and
> a programmatic control API allows DDoS mitigation to be automated. I

right, arbor sells this, as one example. (there are others of course)
there are several large US isp's that use that solution (or an
offspring of that) today. it's not quite sdn, but it is automated and
relatively fire/forget.

-chris


