yahoo crawlers hammering us

Harry Strongburg harry.nanog at
Wed Sep 8 05:54:55 UTC 2010

On Tue, Sep 07, 2010 at 04:19:58PM -0400, Ken Chase wrote:
> This makes it look like Yahoo is actually trafficking in pirated software, but
> that's kinda too funny to expect to be true, unless some yahoo tech decided to
> use that IP/server @yahoo for his nefarious activity, but there are better sites
> than my customer's box to get his 'juarez'.

It's not uncommon at all for a web spider to find large files and 
download them. I don't think there's some conspiracy at Yahoo to find 
warez; they are just operating as a normal spider, indexing the 

> ~500K/s (4Mbps+) for a 3 gig file is kinda... a bit harsh.

What speed would you like a spider to download at? You could rate-limit 
Yahoo's address blocks server-side if you care enough. Ideally, ask 
your customer not to put large programs on there if you're concerned 
about bandwidth. 4 Mb/s isn't abnormal at all for a spider, especially 
on a larger file.
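
To make the server-side idea concrete, here is a rough Python sketch of 
the lookup a throttle would need: map a client address to a per-block 
rate cap, and estimate how long a transfer takes at a given rate. The 
names (CRAWLER_BLOCKS, rate_cap_for, transfer_seconds) and the caps are 
made up for illustration, and the networks are RFC 5737 documentation 
ranges, not Yahoo's real blocks. It only decides the cap; enforcing it 
is up to whatever web server or shaper you run.

```python
import ipaddress

# Hypothetical crawler blocks mapped to a cap in bytes per second.
# These are RFC 5737 documentation ranges, NOT Yahoo's real networks.
CRAWLER_BLOCKS = {
    "203.0.113.0/24": 100_000,
    "198.51.100.0/24": 250_000,
}


def rate_cap_for(client_ip, blocks=CRAWLER_BLOCKS):
    """Return the cap in bytes/s for client_ip, or None if it is not throttled."""
    addr = ipaddress.ip_address(client_ip)
    for cidr, cap in blocks.items():
        if addr in ipaddress.ip_network(cidr):
            return cap
    return None


def transfer_seconds(size_bytes, rate_bytes_per_s):
    """Seconds needed to move size_bytes at a steady rate_bytes_per_s."""
    return size_bytes / rate_bytes_per_s


if __name__ == "__main__":
    # A 3 GB file at ~500 KB/s, the figures quoted above.
    print(transfer_seconds(3_000_000_000, 500_000) / 60, "minutes")
    print(rate_cap_for("203.0.113.7"))
```

At the quoted ~500 KB/s, a 3 GB file (taking 1 GB as 10^9 bytes) ties up 
the link for about 100 minutes, which is the number to weigh against 
whatever cap you pick.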

> Is this expected/my own fault or what?

A little bit of both :)

