
[routing-wg] Issue affecting rsync RPKI repository fetching

George Michaelson ggm at algebras.org
Tue Apr 13 06:40:34 CEST 2021

Not to detract in any way from the paper Randy posted.

For APRICOT 2021 I reported to the APNIC routing security sig as follows:

As of Feb 2021
• 1,009 total ASNs reading the APNIC RPKI data every day
  – 902 distinct ASNs collecting using the RRDP protocol (https)
  – 927 distinct ASNs collecting via rsync

--

So for us, they are mostly coincident sets of ASNs. For APNIC's
publication point, the scale of the issue is "almost everybody does
both".

What's the size of the problem?

The size of the problem is the likelihood of updating rsync state
whilst it's being fetched. There is a simple model to avoid this, which
has been noted: modify discrete directories, then swing a symlink so the
change from "old" to "new" state is as atomic as possible. rsync on
UNIX uses chroot(), and clients already bound to the old state will
drain from it. Then run some cleanup process.
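
The symlink swing above could be sketched roughly as follows. This is a
minimal sketch, not anybody's production publisher: the function name
"publish", the "state.XXXXXX" directory naming and the "current" link name
are all made up for illustration, and "mv -T" assumes GNU coreutils.

```shell
#!/bin/sh
# Hypothetical sketch: build the new state aside, then swing a symlink.
set -eu

publish() {
    root=$1   # directory holding the state.* trees and the "current" link
    src=$2    # freshly generated repository content

    # 1. Build the new state in its own directory. Nothing that is
    #    currently being served gets modified.
    dir=$(mktemp -d "$root/state.XXXXXX")
    chmod 755 "$dir"
    cp -R "$src/." "$dir/"

    # 2. Create the new link under a temporary name, then rename it over
    #    "current". The rename replaces the old link in a single step.
    #    Without -T (GNU), mv would follow the old link and move the
    #    temporary link into the old directory instead.
    ln -s "$(basename "$dir")" "$root/current.tmp.$$"
    mv -T "$root/current.tmp.$$" "$root/current"

    # 3. Old state.* directories are left in place on purpose, so that
    #    clients still reading them can finish. Removing them later is
    #    the separate cleanup step and is not shown here.
}

# Example (hypothetical paths):
#   publish /srv/rsync/repository /var/tmp/generated
```

The rsync module would then be pointed at the "current" link rather than at
any one state directory.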

But if you are caught behind a process which writes to the actively
served rsync directory, the "size" of the subdirectory structure is an
indication of the risk of a Manifest being written to whilst being
fetched. Yes, in absolute terms it could happen to a 1-ROA manifest,
but it is more likely to happen in a manifest of larger size. The "cost"
of a non-atomic update is higher, and I believe the risk is higher. The
risk lies in computing the checksums, writing the Manifest and signing
it while some asynchronous update is happening and whilst people are
fetching the state.

RIRs host thousands of children, so we each have at least one manifest,
covering those certified products, which is significantly larger
and more likely to trigger a problem.

ggm at host6 repository % !fi
find . -type d | xargs -n 1 -I {} echo "ls {} | wc -l" | sh | sort | uniq -c
   1        2
   1        3
   1        7
   1        8
   1        9
   1       52
   1      147
   1     3352
ggm at host6 repository %

Our hosted solution has this structure. Most children by far have fewer
than 10 objects.

ggm at host6 member_repository % find . -type d | xargs -n 1 -I {} echo "ls {} | wc -l" | sh | sort | uniq -c
2997        1
1697        2
2099        3
 560        4
 229        5
  96        6
  44        7
  22        8
  17        9
  11       10
   6       11
   5       12
   5       13
   6       14
   2       15
   4       16
   2       17
   2       18
   1       20
   1       23
   1       25
   1       27
   1       28
   1       29
   1       34
   1       38
   1       40
   3       42
   1       46
   1       60
   1       97
   1      848
ggm at host6 member_repository %
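
For anyone wanting to run the same count on their own publication point,
here is a sketch of a variant that does not build shell text from directory
names, so names containing spaces do not break it. The function name
"dir_histogram" is made up for illustration. Output columns are the number
of directories, then the number of entries per directory, sorted by the
latter. Like the one-liner above, it counts every entry "ls" shows,
subdirectories included.

```shell
dir_histogram() {
    # For every directory under $1, count its entries, then tally how
    # many directories share each count.
    find "$1" -type d -exec sh -c 'ls "$1" | wc -l' _ {} \; |
        awk '{ n[$1 + 0]++ } END { for (k in n) print n[k], k }' |
        sort -k2,2n
}

# Example: dir_histogram .
```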
