Traceroute and random UDP ports

Joe Abley jabley at ca.afilias.info
Wed Aug 13 10:24:37 CDT 2008


On 13 Aug 2008, at 08:56, John Kristoff wrote:

> For further information I
> sugguest consulting Stevens TCP/IP Illustrated chapter 8, dated, but
> still an indispensable resource.

... or the comments in Van's traceroute.c, which are pleasantly  
educational.


Joe

/*
  * traceroute host  - trace the route ip packets follow going to  
"host".
  *
  * Attempt to trace the route an ip packet would follow to some
  * internet host.  We find out intermediate hops by launching probe
  * packets with a small ttl (time to live) then listening for an
  * icmp "time exceeded" reply from a gateway.  We start our probes
  * with a ttl of one and increase by one until we get an icmp "port
  * unreachable" (which means we got to "host") or hit a max (which
  * defaults to net.inet.ip.ttl hops & can be changed with the -m flag).
  * Three probes (change with -q flag) are sent at each ttl setting and
  * a line is printed showing the ttl, address of the gateway and
  * round trip time of each probe.  If the probe answers come from
  * different gateways, the address of each responding system will
  * be printed.  If there is no response within a 5 sec. timeout
  * interval (changed with the -w flag), a "*" is printed for that
  * probe.
  *
  * Probe packets are UDP format.  We don't want the destination
  * host to process them so the destination port is set to an
  * unlikely value (if some clod on the destination is using that
  * value, it can be changed with the -p flag).
  *
  * A sample use might be:
  *
  *     [yak 71]% traceroute nis.nsf.net.
  *     traceroute to nis.nsf.net (35.1.1.48), 64 hops max, 56 byte  
packet
  *      1  helios.ee.lbl.gov (128.3.112.1)  19 ms  19 ms  0 ms
  *      2  lilac-dmc.Berkeley.EDU (128.32.216.1)  39 ms  39 ms  19 ms
  *      3  lilac-dmc.Berkeley.EDU (128.32.216.1)  39 ms  39 ms  19 ms
  *      4  ccngw-ner-cc.Berkeley.EDU (128.32.136.23)  39 ms  40 ms   
39 ms
  *      5  ccn-nerif22.Berkeley.EDU (128.32.168.22)  39 ms  39 ms  39  
ms
  *      6  128.32.197.4 (128.32.197.4)  40 ms  59 ms  59 ms
  *      7  131.119.2.5 (131.119.2.5)  59 ms  59 ms  59 ms
  *      8  129.140.70.13 (129.140.70.13)  99 ms  99 ms  80 ms
  *      9  129.140.71.6 (129.140.71.6)  139 ms  239 ms  319 ms
  *     10  129.140.81.7 (129.140.81.7)  220 ms  199 ms  199 ms
  *     11  nic.merit.edu (35.1.1.48)  239 ms  239 ms  239 ms
  *
  * Note that lines 2 & 3 are the same.  This is due to a buggy
  * kernel on the 2nd hop system -- lbl-csam.arpa -- that forwards
  * packets with a zero ttl.
  *
  * A more interesting example is:
  *
  *     [yak 72]% traceroute allspice.lcs.mit.edu.
  *     traceroute to allspice.lcs.mit.edu (18.26.0.115), 64 hops max
  *      1  helios.ee.lbl.gov (128.3.112.1)  0 ms  0 ms  0 ms
  *      2  lilac-dmc.Berkeley.EDU (128.32.216.1)  19 ms  19 ms  19 ms
  *      3  lilac-dmc.Berkeley.EDU (128.32.216.1)  39 ms  19 ms  19 ms
  *      4  ccngw-ner-cc.Berkeley.EDU (128.32.136.23)  19 ms  39 ms   
39 ms
  *      5  ccn-nerif22.Berkeley.EDU (128.32.168.22)  20 ms  39 ms  39  
ms
  *      6  128.32.197.4 (128.32.197.4)  59 ms  119 ms  39 ms
  *      7  131.119.2.5 (131.119.2.5)  59 ms  59 ms  39 ms
  *      8  129.140.70.13 (129.140.70.13)  80 ms  79 ms  99 ms
  *      9  129.140.71.6 (129.140.71.6)  139 ms  139 ms  159 ms
  *     10  129.140.81.7 (129.140.81.7)  199 ms  180 ms  300 ms
  *     11  129.140.72.17 (129.140.72.17)  300 ms  239 ms  239 ms
  *     12  * * *
  *     13  128.121.54.72 (128.121.54.72)  259 ms  499 ms  279 ms
  *     14  * * *
  *     15  * * *
  *     16  * * *
  *     17  * * *
  *     18  ALLSPICE.LCS.MIT.EDU (18.26.0.115)  339 ms  279 ms  279 ms
  *
  * (I start to see why I'm having so much trouble with mail to
  * MIT.)  Note that the gateways 12, 14, 15, 16 & 17 hops away
  * either don't send ICMP "time exceeded" messages or send them
  * with a ttl too small to reach us.  14 - 17 are running the
  * MIT C Gateway code that doesn't send "time exceeded"s.  God
  * only knows what's going on with 12.
  *
  * The silent gateway 12 in the above may be the result of a bug in
  * the 4.[23]BSD network code (and its derivatives):  4.x (x <= 3)
  * sends an unreachable message using whatever ttl remains in the
  * original datagram.  Since, for gateways, the remaining ttl is
  * zero, the icmp "time exceeded" is guaranteed to not make it back
  * to us.  The behavior of this bug is slightly more interesting
  * when it appears on the destination system:
  *
  *      1  helios.ee.lbl.gov (128.3.112.1)  0 ms  0 ms  0 ms
  *      2  lilac-dmc.Berkeley.EDU (128.32.216.1)  39 ms  19 ms  39 ms
  *      3  lilac-dmc.Berkeley.EDU (128.32.216.1)  19 ms  39 ms  19 ms
  *      4  ccngw-ner-cc.Berkeley.EDU (128.32.136.23)  39 ms  40 ms   
19 ms
  *      5  ccn-nerif35.Berkeley.EDU (128.32.168.35)  39 ms  39 ms  39  
ms
  *      6  csgw.Berkeley.EDU (128.32.133.254)  39 ms  59 ms  39 ms
  *      7  * * *
  *      8  * * *
  *      9  * * *
  *     10  * * *
  *     11  * * *
  *     12  * * *
  *     13  rip.Berkeley.EDU (128.32.131.22)  59 ms !  39 ms !  39 ms !
  *
  * Notice that there are 12 "gateways" (13 is the final
  * destination) and exactly the last half of them are "missing".
  * What's really happening is that rip (a Sun-3 running Sun OS3.5)
  * is using the ttl from our arriving datagram as the ttl in its
  * icmp reply.  So, the reply will time out on the return path
  * (with no notice sent to anyone since icmp's aren't sent for
  * icmp's) until we probe with a ttl that's at least twice the path
  * length.  I.e., rip is really only 7 hops away.  A reply that
  * returns with a ttl of 1 is a clue this problem exists.
  * Traceroute prints a "!" after the time if the ttl is <= 1.
  * Since vendors ship a lot of obsolete (DEC's Ultrix, Sun 3.x) or
  * non-standard (HPUX) software, expect to see this problem
  * frequently and/or take care picking the target host of your
  * probes.
  *
  * Other possible annotations after the time are !H, !N, !P (got a  
host,
  * network or protocol unreachable, respectively), !S or !F (source
  * route failed or fragmentation needed -- neither of these should
  * ever occur and the associated gateway is busted if you see one).  If
  * almost all the probes result in some kind of unreachable, traceroute
  * will give up and exit.
  *
  * Notes
  * -----
  * This program must be run by root or be setuid.  (I suggest that
  * you *don't* make it setuid -- casual use could result in a lot
  * of unnecessary traffic on our poor, congested nets.)
  *
  * This program requires a kernel mod that does not appear in any
  * system available from Berkeley:  A raw ip socket using proto
  * IPPROTO_RAW must interpret the data sent as an ip datagram (as
  * opposed to data to be wrapped in a ip datagram).  See the README
  * file that came with the source to this program for a description
  * of the mods I made to /sys/netinet/raw_ip.c.  Your mileage may
  * vary.  But, again, ANY 4.x (x < 4) BSD KERNEL WILL HAVE TO BE
  * MODIFIED TO RUN THIS PROGRAM.
  *
  * The udp port usage may appear bizarre (well, ok, it is bizarre).
  * The problem is that an icmp message only contains 8 bytes of
  * data from the original datagram.  8 bytes is the size of a udp
  * header so, if we want to associate replies with the original
  * datagram, the necessary information must be encoded into the
  * udp header (the ip id could be used but there's no way to
  * interlock with the kernel's assignment of ip id's and, anyway,
  * it would have taken a lot more kernel hacking to allow this
  * code to set the ip id).  So, to allow two or more users to
  * use traceroute simultaneously, we use this task's pid as the
  * source port (the high bit is set to move the port number out
  * of the "likely" range).  To keep track of which probe is being
  * replied to (so times and/or hop counts don't get confused by a
  * reply that was delayed in transit), we increment the destination
  * port number before each probe.
  *
  * Don't use this as a coding example.  I was trying to find a
  * routing problem and this code sort-of popped out after 48 hours
  * without sleep.  I was amazed it ever compiled, much less ran.
  *
  * I stole the idea for this program from Steve Deering.  Since
  * the first release, I've learned that had I attended the right
  * IETF working group meetings, I also could have stolen it from Guy
  * Almes or Matt Mathis.  I don't know (or care) who came up with
  * the idea first.  I envy the originators' perspicacity and I'm
  * glad they didn't keep the idea a secret.
  *
  * Tim Seaver, Ken Adelman and C. Philip Wood provided bug fixes and/or
  * enhancements to the original distribution.
  *
  * I've hacked up a round-trip-route version of this that works by
  * sending a loose-source-routed udp datagram through the destination
  * back to yourself.  Unfortunately, SO many gateways botch source
  * routing, the thing is almost worthless.  Maybe one day...
  *
  *  -- Van Jacobson (van at ee.lbl.gov)
  *     Tue Dec 20 03:50:13 PST 1988
  */





More information about the NANOG mailing list