[atlas] Actual measurement interval much larger than planned
- Previous message (by thread): [atlas] Actual measurement interval much larger than planned
- Next message (by thread): [atlas] Out of date system probe tags
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Wenqin SHAO
wenqin.shao at telecom-paristech.fr
Thu Oct 20 09:21:18 CEST 2016
Hi Andreas,

Thank you for your interest and for looking into this case.
Attached is a non-exhaustive list of probes and the approximate times at which built-in ping measurements toward b-root (msm_id 1010) are missing.
I hope it is useful.

Regards,
wenqin

> On 19 Oct 2016, at 14:28, Andreas Strikos <astrikos at ripe.net> wrote:
>
> Hi Wenqin,
>
> That was a weird case where a specific probe had time synchronization issues.
> We saw a lot of debug messages in our raw logs coming from the probe, complaining about time sync issues.
> If you have more examples, including other probes, please send us the details and we will happily check them for you.
>
> Regards,
> Andreas
>
> On 16/10/16 22:23, Wenqin SHAO wrote:
>> Hi list,
>>
>> A couple of weeks ago, I asked in this thread why some built-in measurements are missing, or are not performed at the scheduled interval.
>> Cristel and Robert very kindly shared what they thought could be the causes: scheduling, probe reboot, updating the task list, etc.
>>
>> The discussion gave me the idea to check whether measurements also go missing while the probe is powered and connected to an Atlas controller, i.e. while the probe seemingly works in good condition.
>>
>> Here, I would like to share one case among many I've observed where built-in measurements are missed continuously for a long time, even though the probe is well connected to a controller.
>>
>> Let's look at a time window from '2016-06-16 21:53:20 +0000' to '2016-06-18 20:16:40 +0000' for probe 22144.
>>
>> First, I queried its connection events to the Atlas controller (msm_id 7000).
>> The result says the probe connected to a controller at '2016-06-16 21:54:19 +0000' and became disconnected at '2016-06-18 20:13:48 +0000'. Between these two moments, the probe is supposed to have remained connected, and thus continuously powered.
>>
>> Then, I queried the built-in ping measurements toward b-root (msm_id 1010) within the time window.
>> Below are the timestamps at which measurements were performed.
>> ['2016-06-16 22:51:08 +0000', '2016-06-17 02:07:08 +0000', '2016-06-17 03:11:02 +0000',
>> '2016-06-17 04:03:07 +0000', '2016-06-17 05:23:03 +0000', '2016-06-17 07:35:17 +0000',
>> '2016-06-17 10:51:06 +0000', '2016-06-17 14:07:04 +0000', '2016-06-17 15:11:03 +0000',
>> '2016-06-17 17:23:06 +0000', '2016-06-17 18:27:04 +0000', '2016-06-17 20:39:04 +0000',
>> '2016-06-17 22:51:09 +0000', '2016-06-17 23:55:03 +0000', '2016-06-18 02:07:08 +0000',
>> '2016-06-18 03:11:04 +0000', '2016-06-18 06:27:05 +0000', '2016-06-18 08:39:02 +0000',
>> '2016-06-18 10:27:13 +0000', '2016-06-18 10:51:08 +0000', '2016-06-18 11:55:13 +0000',
>> '2016-06-18 12:03:13 +0000', '2016-06-18 15:11:04 +0000', '2016-06-18 17:23:05 +0000',
>> '2016-06-18 18:27:09 +0000', '2016-06-18 18:43:13 +0000', '2016-06-18 19:35:18 +0000',
>> '2016-06-18 20:15:06 +0000']
>> We can see that the intervals between neighbouring measurements are much larger than the planned value of 240 sec.
>>
>> I also investigated other built-in ping measurements, say toward k-root (msm_id 1001).
>> Below are the timestamps:
>> ['2016-06-16 22:51:08 +0000', '2016-06-17 02:07:07 +0000', '2016-06-17 03:11:05 +0000',
>> '2016-06-17 04:03:08 +0000', '2016-06-17 05:23:03 +0000', '2016-06-17 07:35:17 +0000',
>> '2016-06-17 10:51:10 +0000', '2016-06-17 14:07:04 +0000', '2016-06-17 15:11:04 +0000',
>> …]
>> A very similar phenomenon is observed.
>>
>> Between the first two measurements in the above lists, there is an interval of more than one hour, which can hardly be explained by a measurement scheduling issue or temporarily high load.
>> What's more, the probe remained connected at those moments, so reboot and power-off can be ruled out.
>> As a reference, probe 12657 has all the measurements arriving at the due interval within the time window.
>>
>> My question is what could cause such missing measurements.
>> I would appreciate your thoughts on this, so that the measurements can be processed and analysed with proper caution.
>>
>> Thanks.
>>
>> Regards,
>> wenqin
>>> On 02 Sep 2016, at 12:57, Robert Kisteleki <robert at ripe.net> wrote:
>>>
>>> On 2016-09-02 12:20, Wenqin SHAO wrote:
>>>> Thanks for confirming. The specified frequency is indeed well respected. When there is no missing data, the interval shift rarely exceeds 14s, which is small compared to the 240s scheduled interval.
>>>> What intrigues me is that the exact phase/timing is kept as well after a power cut and reboot.
>>>
>>> The probes have a crontab-like mechanism to remember what they need to do.
>>> As long as their clock is more or less ok, they will stick to the
>>> pre-allocated times and tasks.
>>>
>>>> By the way, can a measurement also be skipped, as designed behaviour, due to the scheduling issues mentioned by @Cristel?
>>>
>>> We're trying to avoid overloading probes, but not everything is under our
>>> full control. Some measurements can pile up; Cristel & Randy & co. had a
>>> paper about the observed (worst-case) behaviour.
>>>
>>> Regards,
>>> Robert
>>>
>>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: example.csv
Type: text/csv
Size: 72674 bytes
Desc: not available
URL: </ripe/mail/archives/ripe-atlas/attachments/20161020/4f211562/attachment.csv>
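
The gap check described in the thread (comparing the spacing of consecutive result timestamps against the planned 240 s interval) can be sketched in a few lines of Python. This is only an illustration, not the script used for the attached list: the function name find_gaps, the 14 s tolerance (borrowed from the interval shift mentioned earlier in the thread) and the estimate of missed runs are assumptions, and the input is just the first three b-root timestamps quoted above.

```python
from datetime import datetime

PLANNED_INTERVAL_S = 240  # planned interval of the built-in ping, per the thread
TOLERANCE_S = 14          # assumed tolerance, taken from the shift quoted in the thread


def find_gaps(timestamps, planned=PLANNED_INTERVAL_S, tolerance=TOLERANCE_S):
    """Return (previous, current, gap_seconds, estimated_missed) for each
    pair of consecutive timestamps spaced further apart than planned + tolerance."""
    times = sorted(
        datetime.strptime(t, "%Y-%m-%d %H:%M:%S %z") for t in timestamps
    )
    gaps = []
    for prev, cur in zip(times, times[1:]):
        delta = (cur - prev).total_seconds()
        if delta > planned + tolerance:
            # Rough estimate of how many scheduled runs fell into the gap.
            missed = round(delta / planned) - 1
            gaps.append((prev, cur, delta, missed))
    return gaps


# First three b-root (msm_id 1010) timestamps for probe 22144, as quoted above.
sample = [
    "2016-06-16 22:51:08 +0000",
    "2016-06-17 02:07:08 +0000",
    "2016-06-17 03:11:02 +0000",
]

for prev, cur, delta, missed in find_gaps(sample):
    print(f"{prev} -> {cur}: {delta:.0f} s, about {missed} runs missing")
```

Running the same function over the connection window of a probe, rather than over the whole result set, keeps gaps caused by disconnection or power-off out of the report.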