news from Google
jcdill.lists at gmail.com
Fri Dec 11 21:38:18 CST 2009
Seth Mattinen wrote:
> JC Dill wrote:
>> Seth Mattinen wrote:
>>> Hell, all you gmail users on this list right now are feeding the
>>> machine with all our data.
>>> The part that gets me: everyone seems happy with this.
>> This list has public archives that are already crawled and archived
>> by Google. For example:
>> Subscribing to the list with a gmail account doesn't change anything
>> about what Google knows about the list or list members.
> Those URLs don't seem to include "google.com" in them. Maybe I'm
> misreading them.
I *found* them by searching with Google. I found the second link by
searching for a unique phrase from your email:
A mere 1 hour after you emailed it to the NANOG list, Google web search
has that email archived from the website on seclists.org.
> Crawlers can be excluded with robots.txt if so chosen by the site
> owner so long as google respects said file.
Google does respect that file, but you are counting on other subscribers
respecting the site owner's wishes regarding web archives. In my
experience, this has become a futile fight. If the list doesn't have a
web-accessible archive, it's likely that one of the list's subscribers
will start their own archive or have it archived with one of the many
archive sites, e.g. gmane.
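
To make the robots.txt point concrete, here is a rough sketch of how a
well-behaved crawler decides whether it may fetch an archive page. The
robots.txt content, the "OtherBot" name, and the lists.example.org URLs
are made up for illustration; they are not the rules of any real list
archive. It only shows what one site's file can express. It does nothing
about a second archive run by some other subscriber.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an imaginary list archive (an assumption,
# not taken from any real site): Googlebot is kept out of /archives/,
# every other crawler is allowed everywhere.
ROBOTS_TXT = """\
User-agent: Googlebot
Disallow: /archives/

User-agent: *
Disallow:
"""


def crawler_allowed(robots_txt, user_agent, url):
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "http://lists.example.org/archives/2009-December/msg.html"
    print(crawler_allowed(ROBOTS_TXT, "Googlebot", page))
    print(crawler_allowed(ROBOTS_TXT, "OtherBot", page))
```

Note that the check is entirely voluntary on the crawler's side, and it
covers only the site that publishes the file.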
> Some lists also respect a "no archive" header that some people choose
> to include with their messages.
If you are emailing a publicly archived mailing list that you know is
web archived and likely spidered by Google, a "no archive" header is
mostly useless. When someone replies to your email (as I'm doing now)
your quoted text in the reply will be archived, preserving what you
posted to the list. At best, the "no archive" header merely messes up
threading. The "no archive" header idea never really worked in the
first place - witness all the old usenet server posts that ended up on
dejagoogle even when the posts had "no archive" headers.
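
The quoting problem can be sketched in a few lines. This assumes the
header is spelled "X-No-Archive: yes" (the usual Usenet convention) and
that the archiver is a simple filter on that header; should_archive and
quote_reply are hypothetical helpers, not any real archiver's code.

```python
from email.message import EmailMessage


def should_archive(msg):
    """Hypothetical archiver rule: skip messages marked X-No-Archive: yes."""
    return msg.get("X-No-Archive", "").strip().lower() != "yes"


def quote_reply(original, reply_text):
    """Build a reply that quotes the original body with '> ' prefixes."""
    quoted = "\n".join(
        "> " + line for line in original.get_content().splitlines()
    )
    reply = EmailMessage()
    reply["Subject"] = "Re: " + original["Subject"]
    # The reply carries no X-No-Archive header of its own.
    reply.set_content(quoted + "\n" + reply_text)
    return reply


# The original poster opts out of archiving.
original = EmailMessage()
original["Subject"] = "news from Google"
original["X-No-Archive"] = "yes"
original.set_content("a unique phrase from the original post")

# Someone replies and quotes it, as list members routinely do.
reply = quote_reply(original, "My response.")

# Only the reply passes the filter, yet it contains the original text.
archive = [m for m in (original, reply) if should_archive(m)]
```

The original message is skipped, but its text still lands in the archive
inside the reply, which is also why the thread ends up with a hole in it.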
> Preventing my email to gmail from entering their vast database of
> whatever they track doesn't have any such control features that I'm
> aware of.
Preventing any email you send to anyone from being leaked out to the
public is something you have no control over. Witness the CRU hacked
email controversy. If you don't want what you write to be posted on or
archived on the internet and findable with web searches, don't use the
internet to write or transmit it. Even then, you are at risk of someone
scanning and posting what you write. As a NANOG subscriber you should
be clueful enough to know all of this already. So what's the big issue?