STILL Paging Google...

Michael.Dillon at btradianz.com
Wed Nov 16 14:39:04 UTC 2005


matthew at elvey.com (Matthew Elvey) [Wed 16 Nov 2005, 01:56 CET]:
>Still no word from google, or indication that there's anything wrong 
>with the robots.txt.  Google's estimated hit count is going slightly up, 
>instead of way down.

Way back in the early '90s someone came up with an
elegant solution to this problem. When building a site
in a folder named /httproot, all dynamic pages, i.e.
scripts, were placed in a folder named /httproot/cgi-bin.
Then somebody invented robots.txt to allow people to
tell spiders to leave the cgi-bin folder alone.
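
As a minimal sketch of that arrangement, the snippet below
feeds a two-line robots.txt to Python's standard
urllib.robotparser and asks whether a crawler may fetch a
static page versus a script under /cgi-bin/. The file
contents, the "ExampleBot" agent name, and the two paths are
illustrative assumptions, not taken from the site under
discussion.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a site that keeps every script
# under /cgi-bin/ (an assumption made for illustration).
ROBOTS_TXT = """\
User-agent: *
Disallow: /cgi-bin/
"""


def is_allowed(path, agent="ExampleBot"):
    """Return True if the rules above let `agent` fetch `path`."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


if __name__ == "__main__":
    for path in ("/index.html", "/cgi-bin/search.pl"):
        print(path, is_allowed(path))
```

A compliant spider would skip anything for which is_allowed()
returns False, which is why one Disallow line is enough when
all dynamic pages share a single folder.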

Sites that follow the ancient paradigm do not run
into these kinds of problems. Some people would say that
asking the world to re-engineer the robots.txt protocol,
instead of building sites compliant with the protocol,
violates the robustness principle as expressed
by Jon Postel in RFC 793 section 2.10 and reiterated in
section 4.5 of RFC 3117.

When something doesn't work, the correct operational
response is to fix it.

--Michael Dillon



