This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/db-wg@ripe.net/

[db-wg] NRTM replication inefficiencies

Tim Bruijnzeels tim at ripe.net
Thu Nov 9 16:28:03 CET 2017

Hi Job, WG,

> On 7 Nov 2017, at 23:11, Job Snijders via db-wg <db-wg at ripe.net> wrote:
> 
> I would also welcome an investigation into alternative approaches, (some
> not-via-WHOIS replication mechanisms), perhaps something over HTTPS can
> be done? Either way, something more robust would be useful.

We recently developed and implemented a standard for something similar for RPKI:
https://datatracker.ietf.org/doc/rfc8182/

I believe this approach can be useful here as well. Without going into all the RPKI specifics, it works a little something like this:

Starting points:
= The state of the RPKI repository (or whois) at a given point in time can be represented by a ‘snapshot’
    - This snapshot is “immutable” - therefore it may be cached indefinitely, and we can give it a unique URL and deliver it through a distributed CDN
= The delta between two consecutive snapshots is also “immutable” data - so again we can cache it, give it a unique URL and distribute it the same way
= We can publish a notification file (which should NOT be cached) that points to:
   - the CURRENT snapshot
   - a list of deltas (each for 1 increment) - total size of deltas MUST NOT exceed size of snapshot
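To illustrate the size rule for the delta list, here is a minimal server-side sketch. The dictionary shape with "serial" and "size" keys and the function name prune_deltas are assumptions made for this example; they are not taken from the RFC or from any existing whois code.

```python
def prune_deltas(deltas, snapshot_size):
    """Keep the newest deltas whose combined size fits within the snapshot size.

    `deltas` is a list of dicts with hypothetical keys "serial" and "size".
    The result is ordered oldest to newest and is always a contiguous run
    ending at the newest serial.
    """
    kept = []
    total = 0
    # Walk from newest to oldest and stop at the first delta that would
    # push the combined size past the snapshot size.
    for delta in sorted(deltas, key=lambda d: d["serial"], reverse=True):
        if total + delta["size"] > snapshot_size:
            break
        kept.append(delta)
        total += delta["size"]
    kept.reverse()
    return kept
```

Once the accumulated deltas would cost a client more than the snapshot itself, the oldest ones are dropped, and clients that are that far behind simply fetch the snapshot.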

Clients can then just poll the notification file and work out for themselves whether a usable list of deltas is available to them, or whether they need to get the latest snapshot instead.
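The client-side decision could look roughly like the sketch below. It assumes the notification file has already been fetched and parsed into a dictionary; the key names ("session_id", "serial", "snapshot", "deltas", "uri") and the function name plan_update are hypothetical and chosen for illustration only.

```python
def plan_update(notification, local_session_id, local_serial):
    """Decide what a client should fetch, given a parsed notification file.

    Returns a tuple (action, uris) where action is one of
    "up_to_date", "deltas" or "snapshot".
    """
    snapshot_uri = notification["snapshot"]["uri"]

    # No local state, or the server started a new session: start over.
    if local_serial is None or local_session_id != notification["session_id"]:
        return ("snapshot", [snapshot_uri])

    current = notification["serial"]
    if local_serial == current:
        return ("up_to_date", [])

    # Deltas are usable only if every increment from our serial up to the
    # current serial is listed, with no gaps.
    by_serial = {d["serial"]: d["uri"] for d in notification["deltas"]}
    needed = range(local_serial + 1, current + 1)
    if local_serial < current and all(s in by_serial for s in needed):
        return ("deltas", [by_serial[s] for s in needed])

    # Too far behind, a gap in the list, or a local serial ahead of the server.
    return ("snapshot", [snapshot_uri])
```

All of this work happens on the client; the server only serves static files.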

Yes, we use a session_id and hashes of referenced files for additional checks (details in the RFC).
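As a rough illustration of the hash check, a client could verify each downloaded file against the hash listed in the notification file before applying it. The sketch assumes a hex-encoded SHA-256 digest, which is my understanding of what the RFC specifies; the function name verify_payload is hypothetical.

```python
import hashlib
import hmac


def verify_payload(payload: bytes, expected_hex: str) -> bool:
    """Return True if the SHA-256 digest of payload matches expected_hex."""
    actual = hashlib.sha256(payload).hexdigest()
    # hexdigest() is lowercase, so normalise the expected value before comparing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A client that sees a mismatch would discard the file and could fall back to fetching the snapshot.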

The idea behind this design was that we wanted to minimise the impact on the server. In a chatty protocol (like rsync, which is still used in RPKI) the server and client need to work out their differences together to determine what needs to be transferred. This is fine in a one-to-one relation, but it does not scale when a server needs to serve a multitude of clients. We want to be able to parallelise as much as we can (Amdahl’s law), so we push the computational burden to the clients. The server just needs a one-off investment to create the snapshot, the delta and the latest notification file, all of which it can then offload. Using HTTPS allows us to leverage one of the many, many CDNs out there. This problem has been solved in the industry, so we do not need to invent our own infrastructure for this.

Note that in the case of RPKI the protocol is XML based. This made sense because it leveraged existing definitions in the RPKI space that were also XML based. For whois it may make more sense to look at JSON and/or RDAP.

Please let me know if you see merit in this kind of ‘delta’ protocol in the whois space.

Kind regards

Tim Bruijnzeels
Assistant Manager Software Engineering and Senior Technology Officer
RIPE NCC
