iPhone and Network Disruptions ...

Wed Jul 25 20:16:47 UTC 2007

On Jul 24, 2007, at 5:34 PM, Iljitsch van Beijnum wrote:

>
> On 24-jul-2007, at 15:27, Prof. Robert Mathews (OSIA) wrote:
>
>> Looking at this issue with an 'interoperability lens,' I remain  
>> puzzled by a personal observation that at least in the publicized  
>> case of Duke University's Wi-Fi net being affected, the "ARP  
>> storms" did not negatively impact network operations UNTIL the  
>> presence of iPhones on campus.  The nagging point in my mind  
>> therefore, is: why have other Wi-Fi devices (laptops, HPCs/PDAs,  
>> Smartphones etc.,) NOT caused the 'type' of ARP flooding, which  
>> was made visible in Duke's Wi-Fi environment?
>
> Reading the Cisco document the conclusion seems obvious: the iPhone  
> implements RFC 4436 unicast ARP packets which cause the problem.
>
> I don't have an iPhone on hand to test this and make sure, though.
>
> The difference between an iPhone and other devices (running Mac OS  
> X?) that do the same thing would be that an iPhone is online while  
> the user moves around, while laptops are generally put to sleep  
> prior to moving around.
>

There is also the weird property of many types of "flood vulnerable"  
systems that they seem to remain stable until some sort of threshold  
is reached before suddenly spiraling out of control.

I am not sure of the exact mechanism behind this, but I have seen  
multiple instances of this happening. The standard scenario is  
basically:

You have a couple of switches with STP turned off -- someone plugs in  
some random cable, forming a bridge loop... and everything  
continues running fine, until some time in the future when it all  
goes to hell in a hand-basket. Now, I could understand the system  
remaining stable until the first broadcast / unknown MAC caused  
flooding to happen, but I have seen this system remain stable for  
anywhere from a few days to a few weeks before suddenly exploding.
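
To make the "first broadcast" expectation concrete, here is a minimal sketch of flooding in a looped topology. It is a toy model, not a description of any real switch: the function names (make_links, flood_broadcast), the port numbering and the "host" ingress label are all made up for illustration, and it ignores MAC learning, link speed and queueing. Each switch simply floods a broadcast out every inter-switch port except the one it arrived on.

```python
def make_links(cables):
    """Turn a list of cables into a bidirectional (switch, port) map."""
    links = {}
    for end_a, end_b in cables:
        links[end_a] = end_b
        links[end_b] = end_a
    return links


def flood_broadcast(links, ingress, max_hops):
    """Return the number of frame copies in flight at each hop.

    links   -- map from (switch, port) to the (switch, port) at the far end
    ingress -- (switch, port) where the original broadcast arrives;
               the port may be an edge port that is not in links
    """
    # Collect the inter-switch ports that exist on each switch.
    ports = {}
    for switch, port in links:
        ports.setdefault(switch, set()).add(port)

    in_flight = [ingress]
    history = []
    for _ in range(max_hops):
        history.append(len(in_flight))
        next_hop = []
        for switch, in_port in in_flight:
            # Flood out every inter-switch port except the arrival port.
            for out_port in ports.get(switch, ()):
                if out_port != in_port:
                    next_hop.append(links[(switch, out_port)])
        in_flight = next_hop
    return history


# Two switches joined by two cables: a loop with no STP to break it.
looped = make_links([(("A", 1), ("B", 1)), (("A", 2), ("B", 2))])
# The same two switches joined by a single cable: no loop.
single = make_links([(("A", 1), ("B", 1))])

print(flood_broadcast(looped, ("A", "host"), 6))
print(flood_broadcast(single, ("A", "host"), 4))
```

In this model a single broadcast on the looped pair never drains: two copies keep circulating in opposite directions, while on the single cable the frame is gone after one hop. That is why a loop that stays quiet for days is so puzzling; the sketch shows the expected behaviour, not the delayed one.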

I have seen the same thing happen in systems other than switches, for  
example RIP networks with split-horizon turned off, weird frame-relay  
networks, etc. Unfortunately I have never managed to recreate the  
event in a controlled environment (In the few cases that I have cared  
enough to try, I form a loop and everything goes BOOM immediately!),  
and in the wild have always just fixed it and run away (it's usually  
someone else's network and I'm just helping out or visiting or  
something). I HATE switched networks.....

A few observations:
In *almost* all of the cases, things *do* go boom immediately!
In the instances where they don't, there doesn't seem to be a  
correlation between load and when it does suddenly spiral out of  
control [0].
There is not a gradual increase in the sorts of packets that  
you would expect to see cause this (in a switched environment, you do  
not see flooded packets slowly increase, or even an exponential  
increase over a long time; there is basically no traffic and then  
boom! 100%).

Anyway, I have wondered what triggers it, but never enough to  
actually look into it much....

W

[0] Except for one case that I remember especially fondly -- it was a  
switched network with something like 30 switches scattered around --  
someone had plugged one of those "silver satin" phone type cables  
(untwisted copper) between two ports on a switch -- the cable was bad  
enough that most of the frames were dropped / corrupted, but under  
high broadcast traffic loads enough packets would make it through to  
cause a flood, and then after some time (5-10 minutes) it would die  
back down...
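
One way to read the lossy-cable case is as simple expectation arithmetic. This is only a sketch under stated assumptions, none of which come from measurements: the function name expected_loop_load and the per-traversal survival probability p_survive are hypothetical, each new broadcast is assumed to put two copies on the looped cable (one out each looped port), and every copy is assumed to survive one trip across the cable independently with the same probability.

```python
def expected_loop_load(p_survive, injected_per_tick):
    """Expected number of frame copies still circulating after each tick.

    p_survive         -- assumed chance that a copy crosses the bad cable intact
    injected_per_tick -- new broadcasts entering the switch during each tick
    """
    load = 0.0
    history = []
    for injected in injected_per_tick:
        # Copies already circulating plus two copies per new broadcast
        # attempt the cable; only a fraction p_survive arrive and get
        # flooded again.
        load = p_survive * (load + 2 * injected)
        history.append(load)
    return history


# A short burst of broadcasts followed by silence, with half the frames lost.
print(expected_loop_load(0.5, [1, 1, 0, 0]))
```

In this model the circulating load tracks the broadcast rate and decays geometrically once the burst ends, which is at least the same shape as "flood under load, then die back down". It makes no attempt to reproduce the 5-10 minute time scale.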

-- 
Never criticize a man till you've walked a mile in his shoes.  Then  
if he didn't like what you've said, he's a mile away and barefoot.