[Dnsmon-test] Feedback: data and backend

Gilles Massen gilles.massen at restena.lu
Thu Feb 13 14:55:52 CET 2014


Hello,

As promised, some questions / comments on the data and measurements.
Again, in no particular order.

Data quality: as an exercise I tried to use the Atlas-DNSMON to investigate
errors and speckles seen on the TTM-DNSMON. And I failed completely. So
either I'm thick, or there is a fundamental difference between the old
and new systems that really needs to be spelled out. Just to have an
example: this graph (
http://dnsmon.ripe.net/dns-servmon/server/plot?type=drops&server=k.dns.lu&af=ipv4&day=13&month=2&year=2014&hour=6&minutes=0&period=lastval&plot=SHOW
) shows many yellow spots. A similar view on Atlas-DNSMON (no link
(hint,hint)) shows a perfect world (no unanswered queries). Where does
the difference come from? No single element seems to explain it
(frequency, different probes, geographic location), so what am I missing?

Server selection: could you please clarify which servers are monitored?
Those advertised by the parent, or those in the zone? How often is this
checked? And what happens on a change? (Is there a delay before removing
void servers?)
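
To make the parent-versus-zone question concrete, here is a minimal sketch
of the comparison. It assumes the two NS sets have already been fetched by
some other means; the function name and the example hostnames are made up
for illustration and say nothing about how DNSMON actually selects servers.

```python
def compare_ns_sets(parent_ns, zone_ns):
    """Compare the NS set advertised by the parent with the one in the zone.

    Names are normalised (lower case, no trailing dot) before comparing.
    """
    parent = {name.rstrip(".").lower() for name in parent_ns}
    zone = {name.rstrip(".").lower() for name in zone_ns}
    return {
        "both": sorted(parent & zone),
        "parent_only": sorted(parent - zone),  # delegated, but not in the zone
        "zone_only": sorted(zone - parent),    # in the zone, but not delegated
    }


# Made-up example data, not taken from any real zone.
result = compare_ns_sets(
    ["a.example.net.", "b.example.net.", "old.example.net."],
    ["A.example.net", "b.example.net.", "new.example.net."],
)
print(result)
```

Servers ending up in "parent_only" or "zone_only" are exactly the cases
where the answer to "which set is monitored?" changes what is measured.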

Probe quantity and location: the use of Atlas anchors is certainly a
good idea. However, given the quantity and locations of anchors (almost
nothing beyond Europe), I'd suggest including a few normal
probes in the underrepresented regions, and phasing them out when more
anchors become available. They could be handpicked based on uptime and
connection quality. The current view is really too poor compared to the
TTM view.
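
A rough sketch of how such a handpicked selection could be automated. The
probe records, field names ("region", "uptime", "loss") and thresholds are
all hypothetical; they only illustrate filtering by uptime and connection
quality in regions that have no anchor yet.

```python
from collections import defaultdict


def pick_fill_in_probes(probes, anchor_regions, per_region=2, min_uptime=0.95):
    """Pick the best normal probes in regions that have no anchor.

    probes: list of dicts with hypothetical keys id, region, uptime, loss.
    anchor_regions: set of regions already covered by anchors.
    """
    by_region = defaultdict(list)
    for probe in probes:
        if probe["region"] in anchor_regions:
            continue  # region already covered by an anchor
        if probe["uptime"] < min_uptime:
            continue  # not reliable enough
        by_region[probe["region"]].append(probe)

    selected = {}
    for region, candidates in by_region.items():
        # Highest uptime first, lowest packet loss as tie-breaker.
        candidates.sort(key=lambda p: (-p["uptime"], p["loss"]))
        selected[region] = [p["id"] for p in candidates[:per_region]]
    return selected


# Made-up example data.
probes = [
    {"id": 1, "region": "EU", "uptime": 0.99, "loss": 0.00},
    {"id": 2, "region": "AF", "uptime": 0.99, "loss": 0.01},
    {"id": 3, "region": "AF", "uptime": 0.90, "loss": 0.00},
    {"id": 4, "region": "AF", "uptime": 0.97, "loss": 0.00},
    {"id": 5, "region": "SA", "uptime": 0.96, "loss": 0.02},
    {"id": 6, "region": "AF", "uptime": 0.99, "loss": 0.00},
]
chosen = pick_fill_in_probes(probes, anchor_regions={"EU"})
print(chosen)
```

Phasing the probes out again is then just a matter of adding a region to
anchor_regions once an anchor appears there.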

Access to raw data: it would be useful to have a way to locate the raw
data efficiently. I suppose the measurement IDs are likely to change
over time, so either a fixed identifier (as Gert Goering suggested)
could help, or an API to retrieve things like "the msm_ids contributing
to <tld> dnsmon". I'd humbly suggest providing that rather quickly: as
long as anycast instances are not visible via the user interface, it
would be helpful to retrieve at least the information from the
hostname.bind measurements without duplicating them.

Frequency: if the frequency of measurement is up for discussion, I'd
prefer rather frequent SOA queries (certainly more often than once every
5 minutes) over hostname.bind queries, for the simple reason that I'd
suspect routing to be more stable than DNS data (hand waving...). To me
the data propagation delay certainly is an interesting data point.
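
To illustrate why the SOA query interval matters for the propagation
delay, a small sketch follows. It assumes SOA serials have already been
collected as (timestamp, server, serial) samples; the function name, the
server names and the serials are made up, and serials are compared as
plain integers (no RFC 1982 wrap-around handling).

```python
def propagation_delays(samples, new_serial):
    """Delay, per server, until new_serial is first seen.

    samples: iterable of (timestamp_seconds, server, serial).
    Returns {server: seconds after the first sighting anywhere}, or None
    for a server that never showed the new serial.
    """
    first_seen = {}
    servers = set()
    for ts, server, serial in sorted(samples):
        servers.add(server)
        if serial >= new_serial and server not in first_seen:
            first_seen[server] = ts
    if not first_seen:
        return {}
    t0 = min(first_seen.values())  # first time any server had the update
    return {
        s: (first_seen[s] - t0 if s in first_seen else None)
        for s in sorted(servers)
    }


# Made-up samples taken every 60 seconds from two servers.
samples = [
    (0, "ns1", 2014021301), (0, "ns2", 2014021301),
    (60, "ns1", 2014021302), (60, "ns2", 2014021301),
    (120, "ns1", 2014021302), (120, "ns2", 2014021302),
]
delays = propagation_delays(samples, new_serial=2014021302)
print(delays)
```

The measured delay can never be finer than the sampling interval: with one
query every 5 minutes, any propagation delay under 300 seconds would show
up as zero.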

That's it for the time being. Feel free to criticize!

Best,
Gilles

-- 
Fondation RESTENA - DNS-LU
6, rue Coudenhove-Kalergi
L-1359 Luxembourg
tel: (+352) 424409
fax: (+352) 422473