[routing-wg] Issue affecting rsync RPKI repository fetching
George Michaelson
ggm at algebras.org
Tue Apr 13 06:40:34 CEST 2021
Not to detract from the paper Randy posted, in any way. For APRICOT 2021 I reported to the APNIC routing security SIG as follows. As of Feb 2021:

• 1,009 total ASNs reading the APNIC RPKI data every day
– 902 distinct ASNs collecting using the RRDP protocol (https)
– 927 distinct ASNs collecting via rsync

So for us, they are mostly coincident sets of ASNs. For APNIC's publication point, the scale of the issue is "almost everybody does both".

What's the size of the problem? The size of the problem is the likelihood of updating rsync state whilst it is being fetched. There is a simple model to avoid this which has been noted: modify discrete directories, then swing a symlink so the change from "old" to "new" state is as atomic as possible (see the sketch at the end of this message). rsync on UNIX chroot()s into the served directory, so the existing bound clients will drain from the old state. Then run some cleanup process.

But if you are caught behind a process which writes to the actively served rsync directory, the "size" of the subdirectory structure is an indication of the risk of a Manifest being written to whilst being fetched. Yes, in absolute terms it could happen to a one-ROA Manifest, but it is more likely to happen in a Manifest of significant size. The "cost" of a non-atomic update is higher, and I believe the risk is higher. The risk is computing the checksums, writing the Manifest and signing it, while some asynchronous update is happening, and whilst people are fetching the state. RIRs host thousands of children, so we each have at least one Manifest covering those certificated products which is significantly larger, and more likely to trigger a problem.

ggm at host6 repository % !fi
find . -type d | xargs -n 1 -I {} echo "ls {} | wc -l" | sh | sort | uniq -c
   1 2
   1 3
   1 7
   1 8
   1 9
   1 52
   1 147
   1 3352
ggm at host6 repository %

Our hosted solution has this structure. Most children by far have fewer than 10 objects.

ggm at host6 member_repository % find . -type d | xargs -n 1 -I {} echo "ls {} | wc -l" | sh | sort | uniq -c
   2997 1
   1697 2
   2099 3
   560 4
   229 5
   96 6
   44 7
   22 8
   17 9
   11 10
   6 11
   5 12
   5 13
   6 14
   2 15
   4 16
   2 17
   2 18
   1 20
   1 23
   1 25
   1 27
   1 28
   1 29
   1 34
   1 38
   1 40
   3 42
   1 46
   1 60
   1 97
   1 848
ggm at host6 member_repository %
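For anyone decoding the one-liner in the transcripts above: it runs "ls | wc -l" in every directory under the tree and histograms the counts with uniq -c, so the left column is the number of directories and the right column is the objects-per-directory count. A minimal equivalent without the echo-into-sh indirection might look like the following (assuming a POSIX shell; sort -n is used here so the histogram orders numerically):

   # Histogram of objects-per-directory under the current tree:
   # left column = number of directories, right column = object count.
   find . -type d | while read -r d; do
       ls "$d" | wc -l
   done | sort -n | uniq -c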
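And for completeness, a minimal sketch of the swing-the-symlink publication model mentioned earlier, assuming the rsync daemon is configured to serve through a symlink. The /srv/rpki paths and the "current" link name are hypothetical, not APNIC's actual layout; mv -T is GNU coreutils. The point is that the final step is a single rename(2), which is atomic on POSIX filesystems:

   # Write the complete new repository state into a fresh directory,
   # never touching the tree currently being served. (Paths hypothetical.)
   NEW="/srv/rpki/state.$(date +%s)"
   mkdir -p "$NEW"
   cp -R /srv/rpki/staging/. "$NEW"/

   # Swing the symlink: create it under a temporary name, then
   # rename(2) it over the live one. Clients already chroot()ed into
   # the old tree keep draining from the old state undisturbed.
   ln -s "$NEW" /srv/rpki/current.tmp
   mv -T /srv/rpki/current.tmp /srv/rpki/current

   # A later cleanup pass removes state directories no longer referenced.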