wow, lots of akamai

Charles Polisher chas at chasmo.org
Tue Apr 6 16:16:38 UTC 2021


On 4/5/21 10:23 PM, Robert Brockway wrote:
> On Thu, 1 Apr 2021, Jean St-Laurent via NANOG wrote:
>
>> What happened is that it would create a kind of internal DDoS and 
>> they would all time out and give a weird error message. Something 
>> very useful like Error Code 0x8098808 Please call our support line at 
>> this phone number.
>
> If only there was a way to address the Thundering Herd problem before 
> the cloud. :)
>
>> This simple change, 3 lines of code to add a random artificial 
>> boot penalty of a few seconds, completely solved the problem.
>
> Bingo.  Now, the trick is to catch this before it causes a self-DDoS.
>
> This is a problem that has been recognised for decades and this is 
> unfortunately a good example of how operational experience is still 
> not being distributed properly.  Too many managers think that 
> operational work is obvious and just a result of common sense.  It isn't.

Same problem as disk drives powering up simultaneously
in datacenters. SCSI drives have (had?) a random delay
mechanism to distribute the initial power surge over a few
seconds.
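
For illustration, the random boot penalty quoted above might look
roughly like the sketch below. This is not the code from the
original incident. The function name startup_jitter, the 5 second
ceiling, and the injectable rng/sleep parameters are all
assumptions made for the example.

```python
import random
import time


def startup_jitter(max_delay_s=5.0, rng=random, sleep=time.sleep):
    """Sleep for a random interval in [0, max_delay_s] and return it.

    Hypothetical helper: the name, the default ceiling, and the
    rng/sleep parameters are assumptions, not the original fix.
    """
    # Each client draws its own delay, so clients that boot at the
    # same instant spread their first requests over the window.
    delay = rng.uniform(0.0, max_delay_s)
    sleep(delay)
    return delay


if __name__ == "__main__":
    waited = startup_jitter()
    print(f"waited {waited:.2f}s before first contact with the backend")
```

The ceiling is a trade-off: a wider window flattens the spike
further but makes every client slower to come up.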


