bortzmeyer at nic.fr
Mon Jul 17 09:07:32 UTC 2006
On Wed, Jul 12, 2006 at 06:24:08PM -0400,
Jim Popovitch <jimpop at yahoo.com> wrote
a message of 32 lines which said:
> The strangeness is that some of their crawling is looking for URLs
> with multiple exclamation points, those URLs never existed. This may
> be indicative of a character translation on my system or theirs.
From my experience (and I talked with people - or at least intelligent
bots - at Gigablast), their HTML parser is seriously broken and it
generates non-existent URLs quite often. For instance, <a
href="http://www.example.fr/Cafe%20au%20lait"> will make their crawler
ask for "/Cafe".
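
To make the symptom concrete, here is a small Python sketch. It is not
Gigablast's code, and the cause of their bug is unknown to me: the
truncated_path() function is only a guess at the kind of mistake that
would turn the link above into "/Cafe". The names HrefCollector,
request_path and truncated_path are made up for this illustration.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit, unquote


class HrefCollector(HTMLParser):
    """Collect the href attribute of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.hrefs.append(value)


def request_path(href):
    """Path a crawler should request: percent-escapes are left untouched."""
    return urlsplit(href).path or "/"


def truncated_path(href):
    """Hypothetical bug (an assumption): cut the path at the first '%'."""
    return request_path(href).split("%", 1)[0]


collector = HrefCollector()
collector.feed('<a href="http://www.example.fr/Cafe%20au%20lait">lait</a>')
href = collector.hrefs[0]

print(request_path(href))           # /Cafe%20au%20lait
print(truncated_path(href))         # /Cafe
print(unquote(request_path(href)))  # /Cafe au lait
```

A crawler should send the first form on the request line. The decoded
form is only for display or for mapping to a file name on the server.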
I reported the problem months ago but got nothing back except a
standard "Thanks for telling us".