This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/mat-wg@ripe.net/
[mat-wg] RIPE NCC measurement data retention
Alessandro Improta
aimprota at catchpoint.com
Wed Nov 29 10:18:48 CET 2023
Hello everyone,

Apologies in advance for the long post!

I'm a long-time RIS data user, and I have a couple of suggestions on the RIS data retention topic that Robert presented yesterday.

The first concerns the usefulness of keeping multiple daily snapshots of peer RIBs from decades ago. I agree that having three daily snapshots is useful for taking a quick look at routing tables and is very simple to use. However, I would like to point out that the RIB of each peer can be recreated for any point in time by starting from an earlier RIB snapshot and applying the UPDATE files collected by RIS between the creation of that snapshot and the desired time. For example, if I want to see the RIB status of rrc00 at 04:00 UTC, I can take the RIB snapshot taken at midnight, evolve it with all the UPDATE files from midnight to 04:00 UTC, and enjoy the results.

That said, I think one way to save some data would be to drop two of the three daily snapshots for older months of RIS data. RIS could keep the most recent years' RIBs as they are now and, for anything older than that, remove the RIBs taken at 08:00 and 16:00 while keeping the one taken at 00:00. Taking October 2023 for rrc00 as an example, RIBs took 38.8GB while UPDATEs took 45GB. Of course, different collectors have different peers and record a different volume of BGP updates. Still, cutting the RIBs to one third would give a good saving in data.

The second is about compression. I understand that RIS relies on the collecting software to create gz files, but it would probably be worth considering a switch to a compression technique that compresses the data more, at least for older data. I know RouteViews already uses bz2, which could be a good choice if the collecting software already handles it. Every MRT reader is capable of handling bz2 files. However, I found xz to perform extremely well on MRT files, even though only a few MRT readers are capable of reading it.
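A comparison of gz, bz2 and xz along these lines could be sketched in Python with only the standard library modules gzip, bz2 and lzma. The function name compare_compressors and the synthetic sample data are assumptions made for illustration; a real study would feed in uncompressed MRT files, and ratios obtained on synthetic data say nothing about MRT data.

```python
import bz2
import gzip
import lzma


def compare_compressors(raw: bytes) -> dict[str, int]:
    """Return the size in bytes of `raw` as-is and under gzip, bzip2 and xz.

    Each compressor is run at its highest standard level (9).
    """
    return {
        "raw": len(raw),
        "gz": len(gzip.compress(raw, compresslevel=9)),
        "bz2": len(bz2.compress(raw, compresslevel=9)),
        "xz": len(lzma.compress(raw, preset=9)),
    }


if __name__ == "__main__":
    # Hypothetical stand-in for an uncompressed MRT dump: repetitive,
    # record-like bytes. Replace with the contents of a real file to
    # obtain meaningful numbers.
    sample = b"".join(
        b"PEER|192.0.2.%d|AS%d|PATH 64500 64501 %d\n" % (i % 256, 64500 + i % 50, i)
        for i in range(20000)
    )
    sizes = compare_compressors(sample)
    for name, size in sizes.items():
        ratio = size / sizes["raw"]
        print(f"{name:>3}: {size:>10} bytes ({ratio:.1%} of raw)")
```

Compression and decompression time and memory use are not measured here; they would matter as much as the size when choosing a format for an archive.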
As an exercise, I took bview.20231129.0000.gz of rrc00. The file is 406MB, which becomes 4.1GB uncompressed. If I bzip2 the uncompressed file, I get a bview.20231129.0000.bz2 of 242MB. If I xz the uncompressed file, I get a bview.20231129.0000.xz of 160MB. There may be other compression tools out there that are even more efficient on MRT data. I think a small study of the effectiveness of the different compression techniques should be performed before taking any decision, if you want to follow this route.

Apologies once again for the long post!

Alessandro

Best Regards,
Alessandro Improta
Engineering manager
e. aimprota at catchpoint.com

________________________________
From: mat-wg <mat-wg-bounces at ripe.net> on behalf of Robert Kisteleki <robert at ripe.net>
Sent: Wednesday, November 22, 2023 5:43 PM
To: Measurement Analysis and Tools Working Group <mat-wg at ripe.net>
Subject: [mat-wg] RIPE NCC measurement data retention

Dear all,

We've just published a proposal about establishing principles around how the RIPE NCC retains and publishes Internet measurement data, specifically in RIS and RIPE Atlas:
https://labs.ripe.net/author/kistel/ripe-ncc-measurement-data-retention-principles/

We would be very happy to see discussions about this here on the mailing list, on the RIPE NCC Forum, or live at RIPE87.

Regards,
Robert Kisteleki
RIPE NCC
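The RIB reconstruction idea from Alessandro's message (take a snapshot, then replay the UPDATEs up to the desired time) can be sketched as follows. This is a minimal illustration, not RIS tooling: the Update record, the dict-based RIB and evolve_rib are hypothetical names, and the sketch assumes the MRT files have already been parsed into per-peer records by some MRT reader. It also ignores peer session resets, which a real replay would have to handle.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Update:
    """Hypothetical, already-parsed BGP UPDATE received from one peer."""
    timestamp: int  # seconds, on the same clock as the snapshot time
    announced: dict[str, str] = field(default_factory=dict)  # prefix -> AS path
    withdrawn: tuple[str, ...] = ()  # prefixes withdrawn by this UPDATE


def evolve_rib(snapshot, snapshot_time, updates, target_time):
    """Replay `updates` on top of `snapshot` and return the RIB at `target_time`.

    `snapshot` maps prefix -> AS path for a single peer. Updates stamped at
    or before `snapshot_time` are assumed to be reflected in the snapshot
    already (an assumption of this sketch), and updates after `target_time`
    are ignored.
    """
    rib = dict(snapshot)  # work on a copy so the snapshot stays untouched
    for upd in sorted(updates, key=lambda u: u.timestamp):
        if upd.timestamp <= snapshot_time:
            continue
        if upd.timestamp > target_time:
            break
        # Withdrawals first, then announcements, so a prefix present in
        # both parts of the same UPDATE ends up announced.
        for prefix in upd.withdrawn:
            rib.pop(prefix, None)
        rib.update(upd.announced)
    return rib


if __name__ == "__main__":
    midnight, four_am = 0, 4 * 3600
    snapshot = {"192.0.2.0/24": "64500 64501", "198.51.100.0/24": "64500"}
    updates = [
        Update(3600, announced={"203.0.113.0/24": "64500 64502"}),
        Update(7200, withdrawn=("198.51.100.0/24",)),
        Update(18000, announced={"192.0.2.0/24": "64500"}),  # after 04:00
    ]
    for prefix, path in sorted(evolve_rib(snapshot, midnight, updates, four_am).items()):
        print(prefix, path)
```

The cost of this approach grows with the number of UPDATEs between the snapshot and the target time, which is the trade-off against keeping more frequent snapshots.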