Update from the NANOG Communications Committee regarding recent off-topic posts

Robert Drake rdrake at direcpath.com
Thu Aug 2 20:25:56 UTC 2012


On 7/30/2012 1:42 PM, Patrick W. Gilmore wrote:
> I'm sorry Panashe is upset by this rule.  Interestingly, "Your search - Panashe Flack nanog - did not match any documents."  So my guess is that a post from that account has not happened before, meaning the post was moderated yet still made it through.
>
> Has anyone done a data mining experiment to see how many posts a month are from "new" members?  My guess is it is a trivial percentage.
>

Ignoring many harder-to-determine things like "who has changed their 
email address" and reducing it to simple shell commands, I got this:

for i in $(grep txt ../nanog_archive_index.html | cut -f2 -d\"); do
wget "http://mailman.nanog.org/pipermail/nanog/$i"; done

du -sh reports 41M (100M uncompressed).  That seems small for all the 
mail since some point in 2007, but I'd rather use an official archive so 
people can duplicate results and refine things.
  grep -h "^From: " * |  sort | uniq -c | sort -nr

First of all I will say Owen is winning by a fair margin:

    1562 From: owen at delong.com (Owen DeLong)
     929 From: randy at psg.com (Randy Bush)
     775 From: Valdis.Kletnieks at vt.edu (Valdis.Kletnieks at vt.edu)
     688 From: morrowc.lists at gmail.com (Christopher Morrow)
     621 From: jbates at brightok.net (Jack Bates)
     558 From: jra at baylink.com (Jay Ashworth)
     480 From: gbonser at seven.com (George Bonser)
     450 From: patrick at ianai.net (Patrick W. Gilmore)
     446 From: cidr-report at potaroo.net (cidr-report at potaroo.net)

Total count:
grep -h "^From: " * | wc -l
54166

# Number of contributors with exactly 1, 2, ... 9 posts
for i in 1 2 3 4 5 6 7 8 9; do grep -h "^From: " * | sort | uniq -c | 
sort -nr | grep "^ *$i From: " | wc -l; done
3129
1111
552
319
208
157
131
103
94

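Total contributors with fewer than 10 posts:  5804

Percentages:  5804 contributors is not the same as 5804 posts.  Weighting 
each line above by its post count (3129*1 + 1111*2 + ... + 94*9) gives 
12852 posts, so 12852/54166=24% of posts come from low contributors.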
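
To get the number of low-volume contributors and the posts they account for in one pass, something like the following could work. It is only a sketch: `low_contributor_share` is a hypothetical helper name, and it assumes the same "From: " lines that the commands above already rely on.

```shell
# Sketch: count contributors below a posting threshold and the share of
# all posts they account for. low_contributor_share is a hypothetical
# helper name, not something from the original commands.
low_contributor_share() {
    limit="$1"; shift
    grep -h "^From: " "$@" | sort | uniq -c |
    awk -v limit="$limit" '
        { total += $1 }                        # every post counts toward the total
        $1 < limit { people++; posts += $1 }   # contributor is under the threshold
        END {
            printf "contributors=%d posts=%d total=%d share=%.1f%%\n",
                   people, posts, total, (total ? 100 * posts / total : 0)
        }'
}

# Usage, from the directory holding the unpacked monthly archives:
#   low_contributor_share 10 *
```

Summing the post counts (rather than counting lines of `uniq -c` output) is what turns "how many people" into "how many posts".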

# shows the number of people who've contributed that number of times.
grep -h "^From: " * | sort | uniq -c | sort -nr | awk '{print $1}' | 
uniq -c | sort -nr

# another interesting thing to look at is posts by month per user
# (dropping the -h from grep):
grep "^From: " * | sort | uniq -c | sort -nr

# tells you who posted the most in each month (one grep per file, with -H
# to keep the file name in the output):
for i in *; do grep -H "^From: " "$i" | sort | uniq -c | sort -nr | 
head -n 1; done

# Per month, how many single post contributions happen/total.  The
# numbers can be higher here since people who posted in a different month
# may still be counted as a new contributor
for i in *; do echo -n "$i "; grep "^From: " "$i" | sort | uniq -c | 
sort -nr | grep "      1 " | wc -l | tr '\n' '/'; grep "^From: " "$i" | 
wc -l; done
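
One way to avoid counting someone as new just because they posted once in a given month is to remember every From line seen in earlier files. A rough sketch, assuming the archive files are passed oldest first (`first_time_posters` is a hypothetical helper name):

```shell
# Sketch: for each file, print "file new_posters/total_posts", where a
# new poster is a From line not seen in any earlier file.
# first_time_posters is a hypothetical helper; pass files oldest first.
first_time_posters() {
    awk '
        # Starting a new file: report the previous one and reset counters.
        FNR == 1 && NR > 1 { printf "%s %d/%d\n", prev, fresh, total; fresh = 0; total = 0 }
        { prev = FILENAME }
        /^From: / {
            total++
            if (!($0 in seen)) { seen[$0] = 1; fresh++ }   # first post from this From line
        }
        END { if (NR > 0) printf "%s %d/%d\n", prev, fresh, total }
    ' "$@"
}

# Usage (file names here are only an illustration):
#   first_time_posters 2012-June.txt 2012-July.txt
```

A plain `*` glob sorts by name, which is not chronological if the file names contain month names, so the order has to be given explicitly. This still treats a changed email address as a new poster.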





