[Dnsmon-test] Feedback: data and backend

Gilles Massen gilles.massen at restena.lu
Thu Feb 13 17:20:33 CET 2014


Hi,

On 02/13/2014 04:05 PM, Robert Kisteleki wrote:

>> the difference come from? No single element seems to explain it
>> (frequency, different probes, geographic location), so what am I missing?
> 
> This is most likely caused by the fact that the new implementation is more
> forgiving, uses higher timeouts. In fact, it shows that many of the replies
> actually came in in 3-5 seconds(!); that shows up as green in the new
> version, and as red in the old.

Indeed, that would explain it. May I ask if the DNS tests do retries? Or
is there really an answer coming in after 5s? (didn't even know that was
possible...)

<just saw the email from Chris - that explains it>

> However, it's possible to show this event in the new system, using the
> relative RTT graphs -- see the attachment.

Thanks for pointing that out - I missed the actual usefulness of that view.

>> Server selection: could you please clarify what servers are monitored?
>> Those advertised by the parent, or in the zone? How often is it checked?
>> And what happens on a change? (is there a delay on removing void servers)?
> 
> For an initial server selection we use the list in the zone itself. At the
> moment we're not tracking changes on this -- but it's something worth
> investigating. Does it happen often? Should it be automatically changed?
> Would you expect a certain reaction time from the system?
> 
> (Background info: we're wondering if this is a frequent enough event for us
> to build processes around it.)

From our point of view it is rare enough to have a manual process and send
you an email. My guess is that a change (Server Name or IP address)
happens about 0-2 times a year per TLD. The old DNSMON changelog should
give a good indication, and I believe that Stéphane Bortzmeyer (among
others) is keeping a list of changes in the root zone. The real question
is probably what and how many domains will be covered by DNSMON. If it
will be open to the entire RIPE community (or only RIPE NCC members)
you'd probably need some automation.

But then again, you could always start manually and come up with
something if needed.


> We have anchors in the making in Africa, US, and Australia, amongst other
> places, therefore the bias will change over time and will converge more to
> what TTM (used to) use.
> 
> In the meantime, we opted not to use smaller/home probes. We tried, but it
> really affected the results. It also caused artificial difference in
> monitoring across different zones; the ones that used more flaky probes
> showed up as less stable, though they were not.
> 
> This led us to use anchors only for these measurements.

Understood. Not sure if I like it, but then I can always build my own
UDMs :)

>> Access to raw data: it would be useful to have a way to locate the raw
>> data efficiently. I suppose the measurement id's are likely to change
>> over time, so either a fixed identifier (like Gert Goering suggested)
>> could help, or an API to retrieve things like "the msm_id's contributing
>> for <tld> dnsmon.". I'd humbly suggest providing that rather quickly: as
>> long as anycast instances are not visible via the user interface, it
>> would be helpful to retrieve at least the information from the
>> hostname.bind measurements without duplicating them.
> 
> We intend to keep the measurement IDs as stable as we can, even if we need
> to involve new probes in a measurement. However, you have a point that
> tagging would be even more useful. We'll implement that once we have proper
> support from the Atlas backend.

Good to know. As long as no tagging exists, could you simply publish the
measurement id's somewhere (maybe a greppable textfile, no points on
form here) - that would be nicer than scraping them from the links
inside the views. Even a clever search string for the API would do.
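
A minimal sketch of how such a greppable text file could be consumed,
assuming a hypothetical layout of one "zone msm_id" pair per line. The
file layout, the sample IDs and the function name parse_msm_index are
illustrative assumptions, not something DNSMON or Atlas publishes today.

```python
from collections import defaultdict

# Hypothetical file content: one "zone msm_id" pair per line.
# The IDs below are made up for illustration only.
SAMPLE = """\
# zone  msm_id
lu.   1000001
lu.   1000002
nl.   1000003
"""


def parse_msm_index(text):
    """Map each zone to the list of measurement IDs listed for it."""
    index = defaultdict(list)
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        zone, msm_id = line.split()
        index[zone.lower()].append(int(msm_id))
    return dict(index)


index = parse_msm_index(SAMPLE)
print(index.get("lu.", []))
```

With a file like this, a plain grep for the zone name would already do
the job; the parser only shows that the format stays trivial to script
against.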

Best,
Gilles

-- 
Fondation RESTENA - DNS-LU
6, rue Coudenhove-Kalergi
L-1359 Luxembourg
tel: (+352) 424409
fax: (+352) 422473