Re: anycast stability experiment


in november, we asked ops folk everywhere to be kind enough to
participate in an experiment.  the request message is appended.

we promised to report results.  well, there were fun events,
such as northern-hemisphere winter holidays etc, which got in
the way.  but i just presented some *very* preliminary results
at the apnic meeting in kyoto.  the presentation is viewable
at

    <http://rip.psg.com/~randy/050223.anycast-apnic.pdf>

once again, these results are *very* preliminary, so take them
with a drop of shoyu.

and thanks ever so much to those kind folk who ran the probes.
we may call on you again.

randy


---


From: Randy Bush randy@localhost
To: ops sheep willing to be censored by a non-op nanog@localhost,
	apops@localhost, sig-routing@localhost, routing-wg@localhost,
	afnog@localhost
Subject: anycast stability experiment
Date: Tue, 16 Nov 2004 16:27:46 -0800

the verisign gang gave a good presentation of "Life and Times of
J-Root" at the recent nanog meeting; see
    <http://www.nanog.org/mtg-0410/pdf/kosters.pdf>.
on foils 27 to 29, they reported non-trivial routing jitter and
therefore suggested "DO NOT RUN anycast with stateful transport."

on the other hand, for almost a decade, there have been reports of
successful delivery of stateful services over anycast.  so one
wonders if their measurement was from an abnormal vantage point, or
if there are other interesting things going on.

as we are curious and inclined to test conjectures by experiment,
scientific method and all that, we devised a small experiment in
which we would very much appreciate your participation.

we wish to collect data on the stability of which actual anycast
root server(s) you contact from many points on or near the net's
edges, i.e., your home, datacenter, ....  to do this, peter boothe,
of university of oregon, has written a script which queries those
roots which are known to be widely anycast every few seconds using

   dig [options] @<foo>.root-servers.net. hostname.bind chaos txt

to record which actual unicast-named server was reached.  these
probes are recorded in a file that looks like, for example

    42.666.7.11
    Sat Nov 13 01:29:29 UTC 2004 f UDP pao1e.f.root-servers.org
    Sat Nov 13 01:29:29 UTC 2004 i UDP s1.lnx
    Sat Nov 13 01:29:30 UTC 2004 j UDP jns4-kgtld.j.root-servers.net
    Sat Nov 13 01:29:30 UTC 2004 k UDP k1.linx
    Sat Nov 13 01:29:30 UTC 2004 m UDP M-d3
    Sat Nov 13 01:29:30 UTC 2004 c TCP lax1a.c.root-servers.org
    Sat Nov 13 01:29:31 UTC 2004 f TCP pao1c.f.root-servers.org
    Sat Nov 13 01:29:31 UTC 2004 i TCP s1.lnx
    Sat Nov 13 01:29:31 UTC 2004 j TCP jns1-kgtld.j.root-servers.net
    Sat Nov 13 01:29:32 UTC 2004 k TCP k1.linx
    Sat Nov 13 01:29:33 UTC 2004 m TCP M-d3
    Sat Nov 13 01:29:35 UTC 2004 c UDP lax1a.c.root-servers.org
    Sat Nov 13 01:29:35 UTC 2004 f UDP pao1e.f.root-servers.org
    Sat Nov 13 01:29:35 UTC 2004 i UDP s1.lnx
    Sat Nov 13 01:29:35 UTC 2004 j UDP jns3-kgtld.j.root-servers.net
    Sat Nov 13 01:29:36 UTC 2004 k UDP k1.linx
    Sat Nov 13 01:29:36 UTC 2004 m UDP M-d3

which is mailed off to peter's collector once a day.
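
for anyone who wants to eyeball their own file before it is mailed,
here is a minimal python sketch that reads the format shown above and
counts how often the answering server name changed between consecutive
probes of the same root letter over the same transport.  it is not
part of peter's script; the function names and the choice to compare
per (root, transport) pair are assumptions made for illustration.

```python
from collections import defaultdict

# A few lines in the format shown above (address first, then probes).
SAMPLE = """\
42.666.7.11
Sat Nov 13 01:29:29 UTC 2004 f UDP pao1e.f.root-servers.org
Sat Nov 13 01:29:30 UTC 2004 j UDP jns4-kgtld.j.root-servers.net
Sat Nov 13 01:29:31 UTC 2004 j TCP jns1-kgtld.j.root-servers.net
Sat Nov 13 01:29:35 UTC 2004 f UDP pao1e.f.root-servers.org
Sat Nov 13 01:29:35 UTC 2004 j UDP jns3-kgtld.j.root-servers.net
"""


def parse_probe_log(text):
    """Return (reporter_ip, records).

    Each record is (timestamp, root_letter, transport, server_name).
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    reporter_ip = lines[0]          # first line is the public address
    records = []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) < 9:
            continue                # skip anything that is not a probe line
        timestamp = " ".join(fields[:6])
        root, transport = fields[6], fields[7]
        server = " ".join(fields[8:])
        records.append((timestamp, root, transport, server))
    return reporter_ip, records


def count_switches(records):
    """Count server-name changes per (root_letter, transport) pair."""
    last_seen = {}
    switches = defaultdict(int)
    for _, root, transport, server in records:
        key = (root, transport)
        if key in last_seen and last_seen[key] != server:
            switches[key] += 1
        last_seen[key] = server
    return dict(switches)


reporter, probes = parse_probe_log(SAMPLE)
print(reporter, count_switches(probes))
```

a changed name only says a different server answered; it does not by
itself say whether routing moved, which is why the routing data matter.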

note that your public ip address is the first thing in the file.
this is learned by wget (or telnet to port 80, in case you do not
have wget installed) to peter's server at oregon, which responds
with the public ip address from which you came.  this serves two
purposes:

  o we may want to know from where measurements were taken so we
    can look at routing data to see why anomalies or normalities
    occurred.

  o we are concerned about overloading the root servers against
    which this experiment is running.  therefore, the script will
    not actually perform the probes unless peter's server responds
    to the ip address query.  this allows peter to throttle or
    disable the experiment should 10,000 people decide to run it.
    it might also be used to select probers to concentrate on
    'interesting' probe locales in the internet topology.
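
the gate described in the second point can be sketched roughly as
follows.  this is python for illustration, not the actual shell
script.  `fetch_public_ip` is a hypothetical callable standing in for
the wget/telnet request (the collector's url is not given here), and
checking that the reply parses as an ip address is an added assumption.

```python
import ipaddress


def probing_allowed(fetch_public_ip):
    """Return the reported public address, or None if probing must not start.

    fetch_public_ip is any zero-argument callable that returns the
    collector's reply as text, or raises OSError when the collector
    cannot be reached.
    """
    try:
        reply = fetch_public_ip().strip()
    except OSError:
        return None                 # no answer: the experiment stays off
    try:
        ipaddress.ip_address(reply)
    except ValueError:
        return None                 # answer was not an address: stay off
    return reply


def unreachable():
    raise OSError("collector did not answer")


print(probing_allowed(lambda: "192.0.2.1\n"))
print(probing_allowed(unreachable))
```

the point of the design is that silence from the collector means no
probes are sent, so the experiment can be switched off centrally.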

it's just a short shell script, so you should be able to easily
feel comfortable that it will not make you vulnerable to anything
other than possible expulsion from the nanog mailing list :-).  as
it probes only a select list of anycast root servers, and does so
every two seconds, this should not place any significant load on
the servers or your machine.

as our conjecture involves verisign's implication of stateful,
e.g., tcp, connections, early reviewers of the experimental setup
pushed us to test tcp as well as udp queries.  as a tcp query is
estimated to have many more packets than a udp query, the current
code does one tcp probe every ten udp queries, i.e., around once
every 20 seconds.
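
the arithmetic can be checked with a small python sketch.  it assumes
one probe round every two seconds and that every tenth round uses tcp;
the script's real bookkeeping may differ, and `probe_plan` is a
hypothetical name.

```python
def probe_plan(rounds, interval=2, tcp_every=10):
    """Yield (seconds_since_start, transport) for each probe round.

    Assumption: every tcp_every-th round is TCP, all others are UDP.
    """
    for n in range(1, rounds + 1):
        transport = "TCP" if n % tcp_every == 0 else "UDP"
        yield n * interval, transport


# one minute of probing: 30 rounds at two seconds each
tcp_times = [t for t, transport in probe_plan(30) if transport == "TCP"]
print(tcp_times)
```

under these assumptions the tcp rounds fall 20 seconds apart.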

we would really appreciate it if you would run this for a day or
three.  a tarball with the script and a readme can be found at

   <http://rip.psg.com/~randy/anycast_gatherer-1.4.tar.gz>

[ fyi, my personal bet is that it's usually pretty stable except
for sites in strangely unstable routing environments.  but i am
under my quota of wrong for the week. ]

thanks!  and many thanks to the early reviewers of the experiment.

randy, peter, and friends