This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/[email protected]/
[atlas] Probe uptime?
Viktor Naumov
vnaumov at ripe.net
Mon Feb 1 12:04:53 CET 2016
Hi Max,

ctr-nue19 was rebooted around 11 o'clock UTC because it was not
responding.

The logic is as follows: if a controller doesn't send heartbeats for
more than 2 hours, it is excluded from the list of available
controllers. The algorithm that assigns a probe to a controller tries
to assign the probe to the controller it was last connected to. So in
your case the probe needed to migrate to another controller
(ctr-ams07) after 2 hours of trying to connect to the last controller.

It is not necessarily downtime. It is shown in the report as the
disconnection time. During that time your probe still performs
measurements, but it doesn't send the results to a controller.

We are introducing probe uptime metrics, but they are still in the
making, not yet exposed to users, and not included in the billing.

WBR
/vty

On 2/1/16 11:29 AM, Max Mühlbronner wrote:
>
> Even worse:
>
> Internet Address  Controller     Connected (UTC)      Connected for  Disconnected (UTC)   Disconnected for
> 1xx.xxx.xxx.xx    ctr-ams07, NL  2016-01-31 08:48:48  1d 1h 38m      Still Connected
> 1xx.xxx.xxx.xx    ctr-nue19, DE  2016-01-29 05:07:31  2d 1h 24m      2016-01-31 06:32:02  2h 16m
>
> Could it be a problem with the RIPE "controllers"? E.g. if they are
> offline, my probe also has downtime. How fast do they fail over to
> another controller? (Why was it offline for 2 hours until it switched
> to another one?)
>
> Best Regards
>
> Max M.
>
> On 01.02.2016 09:50, Max Mühlbronner wrote:
>> Hi,
>>
>> I suspect that there is a problem with my Atlas probe. Sometimes it
>> is just gone / offline for ~10 minutes (more or less), although there
>> is clearly no issue on the network. Other checks and monitoring work
>> fine at the same time and show about 100% uptime...
>>
>> I am not sure how to investigate. Is there any chance to check the
>> uptime of the probe and not just the connection time? Maybe the
>> device is hanging or rebooting?
>>
>> Internet Address  Controller     Connected (UTC)      Connected for  Disconnected (UTC)   Disconnected for
>> xxx.xxx.xxx.xx    ctr-nue19, DE  2016-01-06 17:03:27  5d 22h 40m     2016-01-12 15:44:09  0h 5m
>> xxx.xxx.xxx.xx    ctr-nue19, DE  2016-01-01 00:29:44  5d 16h 16m     2016-01-06 16:46:33  0h 16m
>> xxx.xxx.xxx.xx    ctr-nue19, DE  2015-12-31 23:47:35  0h 38m         2016-01-01 00:25:40  0h 4m
>> xxx.xxx.xxx.xx    ctr-nue19, DE  2015-12-31 22:59:18  0h 41m         2015-12-31 23:40:32  0h 7m
>> xxx.xxx.xxx.xx    ctr-nue19, DE  2015-12-31 22:35:12  0h 20m         2015-12-31 22:56:08  0h 3m
>>
>> Best Regards
>>
>> Max M.
>>
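
The controller selection logic described in the reply above can be sketched roughly as follows. This is only an illustration, not RIPE Atlas code: the names `Controller`, `available_controllers` and `assign_controller` are hypothetical, and the fallback of picking the first available controller is an assumption, since the message does not say how the replacement controller is chosen.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional

# From the message: a controller silent for more than 2 hours is excluded.
HEARTBEAT_TIMEOUT = timedelta(hours=2)


@dataclass
class Controller:
    # Hypothetical record; field names are illustrative only.
    name: str
    last_heartbeat: datetime


def available_controllers(controllers: List[Controller], now: datetime) -> List[Controller]:
    """Keep controllers whose last heartbeat is at most 2 hours old."""
    return [c for c in controllers if now - c.last_heartbeat <= HEARTBEAT_TIMEOUT]


def assign_controller(last_controller: Optional[str],
                      controllers: List[Controller],
                      now: datetime) -> Optional[str]:
    """Prefer the controller the probe used last, if it is still available."""
    names = [c.name for c in available_controllers(controllers, now)]
    if last_controller in names:
        return last_controller
    # Assumption: fall back to the first available controller, if any.
    return names[0] if names else None
```

Under this sketch, a probe last connected to a silent controller keeps being assigned to it until the 2-hour heartbeat timeout expires, which matches the roughly 2-hour disconnection shown in the report.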