iPhone and Network Disruptions ...
Warren Kumari
warren at kumari.net
Wed Jul 25 20:16:47 UTC 2007
On Jul 24, 2007, at 5:34 PM, Iljitsch van Beijnum wrote:
>
> On 24-jul-2007, at 15:27, Prof. Robert Mathews (OSIA) wrote:
>
>> Looking at this issue with an 'interoperability lens,' I remain
>> puzzled by a personal observation that at least in the publicized
>> case of Duke University's Wi-Fi net being affected, the "ARP
>> storms" did not negatively impact network operations UNTIL the
>> presence of iPhones on campus. The nagging point in my mind
>> therefore, is: why have other Wi-Fi devices (laptops, HPCs/PDAs,
>> Smartphones etc.,) NOT caused the 'type' of ARP flooding, which
>> was made visible in Duke's Wi-Fi environment?
>
> Reading the Cisco document the conclusion seems obvious: the iPhone
> implements RFC 4436 unicast ARP packets which cause the problem.
>
> I don't have an iPhone on hand to test this and make sure, though.
>
> The difference between an iPhone and other devices (running Mac OS
> X?) that do the same thing would be that an iPhone is online while
> the user moves around, while laptops are generally put to sleep
> prior to moving around.
>
There is also the weird property of many types of "flood vulnerable"
systems that they seem to remain stable until some sort of threshold
is reached before suddenly spiraling out of control.
I am not sure of the exact mechanism behind this, but I have seen
multiple instances of this happening. The standard scenario is
basically:
You have a couple of switches with STP turned off -- someone plugs in
some random cable, forming a bridge loop... and everything
continues running fine, until some time in the future when it all
goes to hell in a hand-basket. Now, I could understand the system
remaining stable until the first broadcast / unknown MAC caused
flooding to happen, but I have seen this system remain stable for
anywhere from a few days to a few weeks before suddenly exploding.
I have seen the same thing happen in systems other than switches, for
example RIP networks with split-horizon turned off, weird frame-relay
networks, etc. Unfortunately I have never managed to recreate the
event in a controlled environment (In the few cases that I have cared
enough to try, I form a loop and everything goes BOOM immediately!),
and in the wild have always just fixed it and run away (it's usually
someone else's network and I'm just helping out or visiting or
something). I HATE switched networks.....
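For reference, here is a minimal sketch of what flooding does to a single broadcast frame once a loop exists and nothing blocks a port. It is a toy model, not a reproduction of any of the networks above: the function name, the topologies and the hop-by-hop timing are all assumptions made for illustration, and it ignores host ports, link capacity and queue drops (so a real network saturates instead of growing without bound).

```python
from collections import Counter


def frames_in_flight(links, start_switch, steps):
    """Count copies of one broadcast frame per hop when no STP blocks any port.

    links: list of (switch_a, switch_b) pairs with switch_a != switch_b;
           a repeated pair models parallel cables between the same switches.
    Returns the number of copies travelling on inter-switch links at each hop.
    """
    # Map each switch to the indices of the links attached to it.
    ports = {}
    for idx, (a, b) in enumerate(links):
        ports.setdefault(a, []).append(idx)
        ports.setdefault(b, []).append(idx)

    def far_end(idx, near):
        a, b = links[idx]
        return b if near == a else a

    # A host injects the frame at start_switch, so the first flood uses every link.
    # Each copy is tracked as (switch it arrives at, link it arrived on).
    arriving = Counter()
    for idx in ports.get(start_switch, []):
        arriving[(far_end(idx, start_switch), idx)] += 1

    history = [sum(arriving.values())]
    for _ in range(steps - 1):
        nxt = Counter()
        for (switch, in_link), count in arriving.items():
            # Flood rule: forward out of every link except the one it came in on.
            for idx in ports[switch]:
                if idx != in_link:
                    nxt[(far_end(idx, switch), idx)] += count
        arriving = nxt
        history.append(sum(arriving.values()))
    return history


# Loop-free chain: the frame reaches the far end and disappears.
print(frames_in_flight([("A", "B"), ("B", "C")], "A", 4))   # [1, 1, 0, 0]
# Two parallel cables (a simple loop): two copies circulate forever.
print(frames_in_flight([("A", "B"), ("A", "B")], "A", 4))   # [2, 2, 2, 2]
# Four fully meshed switches: the copy count doubles on every hop.
mesh = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D")]
print(frames_in_flight(mesh, "A", 4))                       # [3, 6, 12, 24]
```

In this model the very first broadcast already circulates indefinitely, which matches the "goes BOOM immediately" case; it offers no explanation for a loop that stays quiet for days.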
A few observations:
In *almost* all of the cases, things *do* go boom immediately!
In the instances where they don't, there doesn't seem to be a
correlation between load and when it does suddenly spiral out of
control [0].
There is not a gradual increase in the sorts of packets that
you would expect to see cause this (in a switched environment, you do
not see flooded packets slowly increase, or even an exponential
increase over a long time; there is basically no traffic and then
boom! 100%).
Anyway, I have wondered what triggers it, but never enough to
actually look into it much....
W
[0] Except for one case that I remember especially fondly -- it was a
switched network with something like 30 switches scattered around --
someone had plugged one of those "silver satin" phone type cables
(untwisted copper) between two ports on a switch -- the cable was bad
enough that most of the frames were dropped / corrupted, but under
high broadcast traffic loads enough packets would make it through to
cause a flood, and then after some time (5-10 minutes) it would die
back down...
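
A rough way to put numbers on the lossy-cable case in [0], under assumptions that are mine rather than measured: the cable joins two ports of one switch, each trip over it succeeds independently with probability p, and every successful trip causes one more flood of the frame. The function name is hypothetical.

```python
def extra_copies_per_broadcast(p):
    """Expected extra floods of one broadcast caused by a lossy looped cable.

    p is the assumed probability that a frame survives one trip over the cable.
    """
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1); at p = 1 the frame never dies")
    # One copy leaves each end of the cable and keeps going round until dropped,
    # so successful trips per direction sum to p + p^2 + ... = p / (1 - p).
    return 2 * p / (1 - p)


for p in (0.1, 0.5, 0.9):
    print(p, extra_copies_per_broadcast(p))
```

With a cable that drops most frames (say p = 0.1) this gives about 0.22 extra copies per broadcast, so in this model the loop only amplifies whatever broadcast load already exists and the flood fades when that load does. At p = 0.5 every broadcast is tripled, and the factor grows without limit as p approaches 1.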
--
Never criticize a man till you've walked a mile in his shoes. Then
if he didn't like what you've said, he's a mile away and barefoot.