Monitoring other people's sites (Was: Website for returns "HTTP/1.1 500 Internal Server Error")

Nick Hilliard nick at
Tue Mar 20 10:53:42 CDT 2012

On 20/03/2012 14:54, Jeroen Massar wrote:
> For everybody who is "monitoring" other people's websites, please please
> please, monitor something static like /robots.txt as that can be
> statically served and is kinda appropriate as it is intended for robots.

Depends on what you are monitoring.  If you're looking for layer-4 ipv6
connectivity, then robots.txt is fine.  If you're trying to determine
whether a site is serving active content over ipv6 without returning http
errors, then monitoring robots.txt is pretty pointless - you need to
monitor /.
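To illustrate the distinction, here is a minimal sketch of that kind of check in Python: fetch a path over a specific address family and treat any 5xx response as a failure. The function name, port, and timeout are my own placeholders, not anything from the thread.

```python
import http.client
import socket

def check_http(host, port=80, path="/", family=socket.AF_INET6, timeout=5):
    """Fetch `path` from `host` and report (ok, status).

    Probing /robots.txt only proves the server answers; fetching "/"
    also exercises whatever generates the site's active content.
    """
    # Resolve with the requested family only, so a dual-stack host
    # cannot silently fall back to IPv4 when we meant to test ipv6.
    infos = socket.getaddrinfo(host, port, family, socket.SOCK_STREAM)
    addr = infos[0][4][0]
    conn = http.client.HTTPConnection(addr, port, timeout=timeout)
    try:
        # Send the original host name so name-based virtual hosting works.
        conn.request("GET", path, headers={"Host": host})
        status = conn.getresponse().status
    finally:
        conn.close()
    # Anything below 500 means the site answered without a server error.
    return status < 500, status
```

Calling it with `path="/"` instead of `path="/robots.txt"` is exactly the difference being argued here: the former can surface a 500 from the application, the latter usually cannot.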

> Oh and of course do set the User-Agent to something logical and to be
> super nice include a contact address so that people who do check their
> logs once in a while for fishy things they at least know what is
> happening there and that it is not a process run afoul or something.

Good policy, yes.  Some robots do this but others don't.
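For robots that do want to follow that policy, the whole trick is one header. A sketch, with a made-up monitor name and contact address - substitute your own:

```python
import urllib.request

# Hypothetical identification string: name the monitor, give a way to
# reach a human. Neither the name nor the address comes from the thread.
USER_AGENT = "acme-v6-monitor/1.0 (+mailto:noc@example.net)"

def make_request(url):
    """Build a monitoring request with an identifiable User-Agent."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})
```

Anyone grepping their access logs can then see at a glance what the traffic is and who to mail about it.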

> Of course, asking before doing tends to be a good idea too.

Depends on the scale.  I'm not going to ask permission to poll someone
else's site every 5 minutes, and I would be surprised if they asked me the
same.  OTOH, if they were polling to the point that it was causing issues,
that might be different.

> The IPv6 Internet already consists way too much out of monitoring by
> pulling pages and doing pings...

"way too much" for what?  IPv6 is not widely adopted, so it's hardly
surprising that monitoring accounts for a large share of the traffic.

> Fortunately that should heavily change in a few months.

We've been saying this for years.  World IPv6 day 2012 will come and go,
and things are unlikely to change a whole lot.  The only thing that World
IPv6 day 2012 will ensure is that people whose ipv6 configuration actively
interferes with their daily Internet usage will be self-flagged, so their
configuration issues can be dealt with.

>  (who noticed a certain s....h company performing latency checks against
> one of his sites, which was no problem, but the fact that they were
> causing almost more hits/traffic/load than normal clients was a bit on
> the much side)

If the page being polled is that top-heavy, then I'd suggest putting a
cache in front of it.  nginx is good for this sort of thing.
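A minimal illustration of the idea (not a drop-in config): let nginx answer the pollers from its cache and only bother the heavy backend once a minute. Paths, zone names, and times below are placeholders.

```nginx
http {
    proxy_cache_path /var/cache/nginx keys_zone=site:10m max_size=100m;

    server {
        listen [::]:80;                       # answer probes over ipv6
        location / {
            proxy_pass http://127.0.0.1:8080; # the real (heavy) backend
            proxy_cache site;
            proxy_cache_valid 200 1m;         # serve cached "/" for a minute
            proxy_cache_use_stale error timeout updating;
        }
    }
}
```

With something like this, a monitor hitting / every few minutes never touches the backend more than the cache validity allows.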


More information about the NANOG mailing list