anycast stability experiment
Randy Bush randy at psg.com
Thu Feb 24 07:43:36 CET 2005
in november, we asked ops folk everywhere to be kind enough to participate in an experiment. the request message is appended.

we promised to report results. well, there were fun events, such as northern-hemisphere winter holidays etc, which got in the way. but i just presented some *very* preliminary results at the apnic meeting in kyoto. the presentation is viewable at

    <http://rip.psg.com/~randy/050223.anycast-apnic.pdf>

once again, these results are *very* preliminary, so take them with a drop of shoyu.

and thanks ever so much to those kind folk who ran the probes. we may call on you again.

randy

---

From: Randy Bush <randy at psg.com>
To: ops sheep willing to be censored by a non-op <nanog at nanog.org>, apops at apops.net, sig-routing at apnic.net, routing-wg at ripe.net, afnog at afnog.org
Subject: anycast stability experiment
Date: Tue, 16 Nov 2004 16:27:46 -0800

the verisign gang gave a good presentation of "Life and Times of J-Root" at the recent nanog meeting, see <http://www.nanog.org/mtg-0410/pdf/kosters.pdf>. on foils 27 to 29, they reported non-trivial routing jitter and therefore suggested "DO NOT RUN anycast with stateful transport."

on the other hand, for almost a decade, there have been reports of successful delivery of stateful services over anycast. so one wonders if their measurement was from an abnormal vantage point, or if there are other interesting things going on.

as we are curious and inclined to test conjectures by experiment, scientific method and all that, we devised a small experiment in which we would very much appreciate your participation.

we wish to collect data on the stability of which actual anycast root server(s) you contact from many points on or near the net's edges, i.e., your home, datacenter, ....

to do this, peter boothe, of university of oregon, has written a script which queries those roots which are known to be widely anycast every few seconds using

    dig [options] @<foo>.root-servers.net. hostname.bind chaos txt

to record which actual unicast-named server was reached. these probes are recorded in a file that looks like, for example

    42.666.7.11
    Sat Nov 13 01:29:29 UTC 2004 f UDP pao1e.f.root-servers.org
    Sat Nov 13 01:29:29 UTC 2004 i UDP s1.lnx
    Sat Nov 13 01:29:30 UTC 2004 j UDP jns4-kgtld.j.root-servers.net
    Sat Nov 13 01:29:30 UTC 2004 k UDP k1.linx
    Sat Nov 13 01:29:30 UTC 2004 m UDP M-d3
    Sat Nov 13 01:29:30 UTC 2004 c TCP lax1a.c.root-servers.org
    Sat Nov 13 01:29:31 UTC 2004 f TCP pao1c.f.root-servers.org
    Sat Nov 13 01:29:31 UTC 2004 i TCP s1.lnx
    Sat Nov 13 01:29:31 UTC 2004 j TCP jns1-kgtld.j.root-servers.net
    Sat Nov 13 01:29:32 UTC 2004 k TCP k1.linx
    Sat Nov 13 01:29:33 UTC 2004 m TCP M-d3
    Sat Nov 13 01:29:35 UTC 2004 c UDP lax1a.c.root-servers.org
    Sat Nov 13 01:29:35 UTC 2004 f UDP pao1e.f.root-servers.org
    Sat Nov 13 01:29:35 UTC 2004 i UDP s1.lnx
    Sat Nov 13 01:29:35 UTC 2004 j UDP jns3-kgtld.j.root-servers.net
    Sat Nov 13 01:29:36 UTC 2004 k UDP k1.linx
    Sat Nov 13 01:29:36 UTC 2004 m UDP M-d3

which is mailed off to peter's collector once a day.

note that your public ip address is the first thing in the file. this is learned by wget (or telnet to port 80, in case you do not have wget installed) to peter's server at oregon, which responds with the public ip address from which you came. this serves two purposes:

  o we may want to know from where measurements were taken so we can look at routing data to see why anomalies or normalities occurred.

  o we are concerned about overloading the root servers against which this experiment is running. therefore, the script will not actually perform the probes unless peter's server responds to the ip address query. this allows peter to throttle or disable the experiment should 10,000 people decide to run it. it might also be used to select probers to concentrate on 'interesting' probe locales in the internet topology.
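
for anyone who wants to look at their own probe file before mailing it off, here is a minimal python sketch that reads the format shown above and counts how often the answering server name changed for each root and transport. it is not part of peter's script; the function names `parse_probe_log` and `count_flips` are made up for illustration, and the nine-field record layout is an assumption read off the example, not a documented format.

```python
def parse_probe_log(text):
    """Return (public_ip, records) from the text of a probe file.

    Assumed layout, taken from the example in the message: the first
    non-empty line is the prober's public ip, and every later line is
    '<dow> <mon> <day> <hh:mm:ss> <tz> <year> <root> <proto> <server>'.
    """
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    public_ip, records = lines[0], []
    for ln in lines[1:]:
        fields = ln.split()
        if len(fields) != 9:
            continue  # skip anything that does not match the assumed layout
        timestamp = " ".join(fields[:6])
        root, proto, server = fields[6:]
        records.append((timestamp, root, proto, server))
    return public_ip, records


def count_flips(records):
    """Count, per (root, proto), how often the server name changed
    between one probe and the next."""
    last_seen = {}
    flips = {}
    for _timestamp, root, proto, server in records:
        key = (root, proto)
        flips.setdefault(key, 0)  # every probed pair shows up, even with 0 flips
        if key in last_seen and last_seen[key] != server:
            flips[key] += 1
        last_seen[key] = server
    return flips
```

on the example file, j over udp answers as jns4-kgtld and later as jns3-kgtld, which this sketch counts as one flip, while k over udp stays on k1.linx and counts zero. note that the sketch only compares names; it cannot tell a routing change from several hosts answering behind one site.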
it's just a short shell script, so you should be able to easily feel comfortable that it will not make you vulnerable to anything other than possible expulsion from the nanog mailing list :-).

as it probes only a select list of anycast root servers, and does so every two seconds, this should not place any significant load on the servers or your machine.

as our conjecture involves verisign's implication of stateful, e.g., tcp, connections, early reviewers of the experimental setup pushed us to test tcp as well as udp queries. as a tcp query is estimated to have many more packets than a udp query, the current code does one tcp probe every ten udp queries, i.e., around once every 20 seconds.

we would really appreciate it if you would run this for a day or three. a tarball with the script and a readme can be found at

    <http://rip.psg.com/~randy/anycast_gatherer-1.4.tar.gz>

[ fyi, my personal bet is that it's usually pretty stable except for sites in strangely unstable routing environments. but i am under my quota of wrong for the week. ]

thanks! and many thanks to the early reviewers of the experiment.

randy, peter, and friends
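
the probe cadence described in the message (udp every two seconds, one tcp probe per ten udp queries) can be sketched in python as below. this only illustrates the arithmetic; the actual shell script may interleave probes differently, and the names `UDP_INTERVAL_S`, `UDP_PER_TCP` and `probe_schedule` are hypothetical.

```python
import itertools

UDP_INTERVAL_S = 2   # "every two seconds", from the message
UDP_PER_TCP = 10     # "one tcp probe every ten udp queries", from the message


def probe_schedule():
    """Yield (seconds_from_start, transport) pairs forever.

    Assumption for illustration: the tcp probe is sent right after
    every tenth udp round.
    """
    for n in itertools.count():
        t = n * UDP_INTERVAL_S
        yield (t, "UDP")
        if (n + 1) % UDP_PER_TCP == 0:
            yield (t, "TCP")


# offsets of the first two tcp probes; they come out 20 seconds apart
tcp_times = [t for t, proto in itertools.islice(probe_schedule(), 22)
             if proto == "TCP"]
```

under these assumptions the tcp probes land ten udp rounds apart, which matches the "around once every 20 seconds" figure in the message.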