This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/[email protected]/
[atlas] Actual measurement interval much larger than planned
Andreas Strikos
astrikos at ripe.net
Wed Oct 19 14:28:21 CEST 2016
Hi Wenqin,

that was a weird case where that specific probe had time synchronization issues. We saw a lot of debug messages in our raw logs coming from the probe, complaining about time sync issues.
If you have more examples, including other probes, please send us the details and we will happily check them for you.

Regards,
Andreas

On 16/10/16 22:23, Wenqin SHAO wrote:
> Hi list,
>
> A couple of weeks ago, I asked in this thread why some built-in
> measurements are missing, or not performed at the scheduled interval.
> Cristel and Robert very kindly shared what they thought could be the
> causes: scheduling, probe reboot, updating the task list, etc.
>
> The discussion gave me the idea to verify whether measurements are
> also missing while the probe is powered and connected to an Atlas
> controller, i.e. while the probe seemingly works in good condition.
>
> Here, I would like to share one case among many I've observed where
> built-in measurements are missed continuously for a long time, even
> when the probe is well connected to a controller.
>
> Let's look at a time window from '2016-06-16 21:53:20 +0000' to
> '2016-06-18 20:16:40 +0000' for probe 22144.
>
> First, I queried its connection events to the Atlas controller
> (msm_id 7000).
> The result says the probe connected to a controller at
> '2016-06-16 21:54:19 +0000' and became disconnected at
> '2016-06-18 20:13:48 +0000'. Between these two moments, the probe is
> supposed to have remained connected, and thus continuously powered.
>
> Then, I queried the built-in ping measurements toward b-root (msm_id
> 1010) within the time window.
> Below are the timestamps at which the measurements were performed:
> ['2016-06-16 22:51:08 +0000', '2016-06-17 02:07:08 +0000',
> '2016-06-17 03:11:02 +0000', '2016-06-17 04:03:07 +0000',
> '2016-06-17 05:23:03 +0000', '2016-06-17 07:35:17 +0000',
> '2016-06-17 10:51:06 +0000', '2016-06-17 14:07:04 +0000',
> '2016-06-17 15:11:03 +0000', '2016-06-17 17:23:06 +0000',
> '2016-06-17 18:27:04 +0000', '2016-06-17 20:39:04 +0000',
> '2016-06-17 22:51:09 +0000', '2016-06-17 23:55:03 +0000',
> '2016-06-18 02:07:08 +0000', '2016-06-18 03:11:04 +0000',
> '2016-06-18 06:27:05 +0000', '2016-06-18 08:39:02 +0000',
> '2016-06-18 10:27:13 +0000', '2016-06-18 10:51:08 +0000',
> '2016-06-18 11:55:13 +0000', '2016-06-18 12:03:13 +0000',
> '2016-06-18 15:11:04 +0000', '2016-06-18 17:23:05 +0000',
> '2016-06-18 18:27:09 +0000', '2016-06-18 18:43:13 +0000',
> '2016-06-18 19:35:18 +0000', '2016-06-18 20:15:06 +0000']
> We can see that the intervals between neighbouring measurements are
> much larger than the planned value of 240 s.
>
> I also investigated other built-in ping measurements, for example
> toward k-root (msm_id 1001).
> Below are the timestamps:
> ['2016-06-16 22:51:08 +0000', '2016-06-17 02:07:07 +0000',
> '2016-06-17 03:11:05 +0000', '2016-06-17 04:03:08 +0000',
> '2016-06-17 05:23:03 +0000', '2016-06-17 07:35:17 +0000',
> '2016-06-17 10:51:10 +0000', '2016-06-17 14:07:04 +0000',
> '2016-06-17 15:11:04 +0000',
> …]
> A very similar phenomenon is observed.
>
> Between the first two measurements in the above lists, there is an
> interval of more than one hour, which can hardly be explained by
> measurement scheduling issues or temporarily high load.
> What's more, the probe remained connected at those moments, so reboot
> and power-off can be ruled out.
> As a reference, probe 12657 has all its measurements arriving at the
> due interval within the same time window.
>
> My question is what could be causing these missing measurements.
> I would appreciate your thoughts on this, so that the measurements
> can be processed and analysed with proper caution.
>
> Thanks.
>
> Regards,
> wenqin
>> On 02 Sep 2016, at 12:57, Robert Kisteleki <robert at ripe.net> wrote:
>>
>> On 2016-09-02 12:20, Wenqin SHAO wrote:
>>> Thanks for confirming. The specified frequency is indeed well
>>> respected. When there is no missing data, the interval shift rarely
>>> exceeds 14 s, which is small compared to the scheduled interval of
>>> 240 s.
>>> What intrigues me is that the exact phase/timing is also kept
>>> after a power cut and reboot.
>>
>> The probes have a crontab-like mechanism to remember what they need
>> to do.
>> As long as their clock is more or less ok, they will stick to the
>> pre-allocated times and tasks.
>>
>>> By the way, can a measurement also be skipped, as designed
>>> behaviour, due to the scheduling issues mentioned by @Cristel?
>>
>> We're trying to avoid overloading probes, but not everything is under our
>> full control. Some measurements can pile up; Cristel & Randy & co. had a
>> paper about the observed (worst-case) behaviour.
>>
>> Regards,
>> Robert
>>
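
The interval check described in the quoted message can be sketched in a few lines of Python. This is only an illustration, not the script Wenqin used: the function name `find_gaps` and the 60 s `slack` are assumptions made here (the thread only states that the interval shift rarely exceeds 14 s), and the sample holds the first four b-root timestamps from the list above.

```python
from datetime import datetime

# Timestamp format used in the lists quoted in this thread.
FMT = "%Y-%m-%d %H:%M:%S %z"

def find_gaps(timestamps, interval=240, slack=60):
    """Return (previous, current, gap_seconds, missed_slots) for every
    pair of consecutive results spaced by more than interval + slack.

    `slack` is a hypothetical tolerance chosen for this sketch.
    """
    times = sorted(datetime.strptime(t, FMT) for t in timestamps)
    gaps = []
    for prev, cur in zip(times, times[1:]):
        gap = (cur - prev).total_seconds()
        if gap > interval + slack:
            # Estimated number of scheduled runs with no result in between.
            missed = round(gap / interval) - 1
            gaps.append((prev, cur, gap, missed))
    return gaps

# First four b-root (msm_id 1010) timestamps reported for probe 22144.
sample = [
    "2016-06-16 22:51:08 +0000",
    "2016-06-17 02:07:08 +0000",
    "2016-06-17 03:11:02 +0000",
    "2016-06-17 04:03:07 +0000",
]

for prev, cur, gap, missed in find_gaps(sample):
    print(f"{prev:%Y-%m-%d %H:%M:%S} -> {cur:%Y-%m-%d %H:%M:%S}: "
          f"{gap:.0f} s, about {missed} runs missing")
```

Running the same check over the connection window of a probe makes it easy to list the affected periods before sending them to the Atlas team.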