This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/[email protected]/
[db-wg] Decision on NWI-2 Historical queries
denis walker
ripedenis at gmail.com
Fri Apr 22 01:31:51 CEST 2022
Colleagues

Although I am replying to your email Cynthia, my comments are generally expressed to all. Let me try again to explain history. I still think there is a misunderstanding about what the RIPE Database is and what it means to publish information in this database.

This is a public database. It is available globally to anyone who has an internet connection. Once you publish any information in this database it is public data. The full details of that data will be downloaded and copied by many people. This has almost certainly been done for many years if not decades. There may well be some people out there who have more historical data than the RIPE NCC holds. Anyone who is concerned about privacy should not allow their personal data to be published in this database. Once it is published it is too late to worry about privacy. It is already out there, it is public and it has been copied. You cannot take it back. You may have the right to be forgotten, but you don't have the means to be forgotten once you have broadcast your personal details in public. That is the simple reality of the internet.

So what data might some people have? I am not going to explain the full technical details of how to do this, for obvious security reasons. But there are at least two ways that anyone can create their own private copy of the whole RIPE Database including full details of every update and including all personal data. It is quite likely that some people have been doing this for the last 20+ years. Even if you haven't done this in the past, you could start doing it today and create a private copy with all future updates. So any object that is deleted tomorrow will still exist in your private database.

The RIPE Database has no privacy protection for personal data once it has been published. It is a PUBLIC database accessible by everyone. This is one of the many reasons why the RIPE Database should not contain any personal data.

So back to historical queries.
By providing this service for operational data going back to the start of the database, without any arbitrary restrictions, we remove the need for most people to go to the trouble of building up their own local, private database with a full copy of the data. By restricting access to past operational data on some arbitrary condition we actually monetise the data that some people already hold. We could create a market for some people to sell that data. Just like IPv4, where there is a demand for something, with few options for supply, some people will make money from it.

So again I will say that there is no valid reason to restrict access to full operational data through an official interface that attempts to protect personal information.

cheers
denis
co-chair DB-WG

On Sat, 16 Apr 2022 at 18:39, Cynthia Revström <me at cynthia.re> wrote:
>
> Hi,
>
> Sorry for the late reply, I didn't see that you had replied.
>
> The reason that I think the last deletion should be the cut-off point
> is that it feels quite natural, you deleted the object, it should no
> longer exist.
>
> Also given how a decent number of individuals now have resources from
> the NCC (either directly or through a sponsoring LIR) I feel like we
> should consider if organisation objects should really have history.
> This feels even more important given how full legal names are required
> (afaik), which will then be published.
>
> This might be a separate discussion but I think it is important to
> consider the pros and cons of this.
>
> -Cynthia
>
>
> On Tue, Apr 5, 2022 at 10:50 AM denis walker <ripedenis at gmail.com> wrote:
> >
> > Hi Cynthia
> >
> > I understand your concerns but feel I need to clarify the points you
> > made. Most people don't understand the internal workings of the
> > database. That is not surprising as it is quite complex. I spent 14
> > years with the RIPE NCC as a holistic engineer and analyst working
> > with the database. I know it inside out from all possible angles.
> > I wrote the spec for historical queries in 2013. We had no requirements,
> > it was just an idea that someone thought might be useful. To get
> > something up and running quickly I went for the low hanging fruit.
> > Because of the way objects are managed internally within the database,
> > stopping at the most recent deletion point for an object history was
> > simpler. It was a completely arbitrary, technical constraint. It had
> > no privacy reasoning behind it. At the time we released this service
> > in 2013, it was reviewed by the legal team and they had not asked for
> > this constraint for any reason. (The legal team is currently doing a
> > new review of the issue.)
> >
> > So let's look at exactly what this constraint means. Currently we
> > provide some history of operational and corporate data objects. These
> > objects are all heavily redacted to remove anything that is considered
> > to be personal data. We do not provide any history of objects that are
> > considered to be mostly personal or security data. Any object can have
> > multiple versions and each version can have multiple instances. An
> > object may be created (v1), updated several times, deleted, re-created
> > (v2), updated, deleted, re-created (v3) and updated again. The
> > arbitrary constraint means only v3 data will be available with all
> > its updates. None of the data for v1 and v2 is made available. By
> > dropping this arbitrary constraint all the history of this object, v1,
> > v2 and v3, will be available. All the data for v1 and v2 will be
> > redacted in the same way as v3 to remove any personal data. There is
> > no difference in the data for v1, v2 and v3. It is all operational and
> > corporate data with personal data removed. As we have generally
> > accepted that it is ok to release the historical, non personal, data
> > for v3, it should also be ok to release the historical, non personal,
> > data for v1 and v2.
> > If there are any privacy concerns over the v1, v2
> > data, then we would have exactly the same privacy concerns over the v3
> > data that we currently provide.
> >
> > This technical change does not change the object types that we allow
> > historical data to be queried for. Nor does it change any of the
> > attribute types within those objects that we redact. We simply provide
> > a complete set rather than a partial set of the operational and
> > corporate data we currently offer.
> >
> > Let's look at some examples of what this means. Many objects have never
> > been deleted. Some of these have been around since this version of the
> > RIPE Database model was released in 2001. If you query for the history
> > of these objects you will get the full 21 years of history. Many
> > assignments are given out to end users, then deleted and re-assigned
> > to another end user. This can happen multiple times. This prefix will
> > have many versions v1...vn. Only the history of vn is currently
> > available, and only if this version still exists in the 'live'
> > database. For many researchers of address space usage or routing
> > issues and abuse investigators and brokers, the history beyond vn is
> > useful data. Also when a resource is transferred or consolidated the
> > allocation object is deleted and re-created by the RIPE NCC. So for an
> > object which last week you could see 21 years of history, which has
> > just been transferred, you will only see 1 week of history now. There
> > is also occasionally a case of an operator who accidentally deletes
> > the wrong assignment and does not have an up-to-date copy of the
> > object as it was just before it was deleted. They currently have to
> > ask the RIPE NCC to supply them with the most recent version. Without
> > this arbitrary constraint they could just recreate it, then look up
> > the history themselves.
> >
> > There is also the issue of NRTM.
> > Some users have been operating an
> > NRTM stream for years, even decades. They have the un-redacted
> > versions of the history of all the resource objects since they started
> > streaming the updates. Redacting personal data only started a few
> > years ago. In many cases they have the full history, regardless of how
> > many times the object has been deleted and re-created. Anyone who
> > starts an NRTM stream now will start to build up their own history of
> > redacted, operational data objects that will remain in their own
> > database after objects are deleted and re-created. That is an accepted
> > practice.
> >
> > This constraint really is totally arbitrary. It is rooted in technical
> > expedience in getting the service started. It has nothing to do with
> > privacy. There really is no good reason to keep this arbitrary
> > constraint. Either we provide historical operational data or we don't.
> > Offering a partial data set based on random events makes no sense.
> >
> > cheers
> > denis
> > co-chair DB-WG
> >
> > On Mon, 4 Apr 2022 at 23:33, Cynthia Revström <me at cynthia.re> wrote:
> > >
> > > Hi,
> > >
> > > I am sorry for the delayed response but I object to removing that constraint.
> > >
> > > It feels problematic to me from a privacy perspective, and it feels
> > > like the last deletion point is a fair balance between providing
> > > useful info and not providing too much info.
> > >
> > > I don't think this restriction should be removed, at least not without
> > > a very good reason to do so.
> > >
> > > -Cynthia
> > >
> > > On Thu, Mar 31, 2022 at 1:32 PM denis walker via db-wg <db-wg at ripe.net> wrote:
> > > >
> > > > Colleagues
> > > >
> > > > When I wrote the spec for historical queries, almost 10 years ago, I
> > > > included an arbitrary constraint to only show history back to the most
> > > > recent deletion point. This was to get something in production quickly
> > > > and see how useful it was.
> > > > Over the years several people have asked
> > > > for this arbitrary constraint to be removed. No one has objected to
> > > > removing it. The co-chairs therefore recommend that this arbitrary
> > > > constraint be removed. The co-chairs now ask the RIPE NCC to produce
> > > > an impact analysis and implementation plan to remove it. We will then
> > > > seek a final approval from the community on the plan.
> > > >
> > > > cheers
> > > > denis
> > > > co-chair DB-WG
> > > >
> > > > --
> > > >
> > > > To unsubscribe from this mailing list, get a password reminder, or change your subscription options, please visit: https://mailman.ripe.net/
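
The v1/v2/v3 description in the thread can be made concrete with a small sketch. This is only an illustration under stated assumptions: the Event record, its serial and kind fields, and the visible_history function are all hypothetical names, and nothing here reflects how the RIPE Database actually stores or serves history. It models an object's life as a list of create/update/delete events and shows what a history query would return with and without the "stop at the most recent deletion" constraint. Redaction of personal data is not modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    serial: int  # hypothetical update counter, not a real RIPE Database field
    kind: str    # "create", "update" or "delete"


def visible_history(events: List[Event], stop_at_last_deletion: bool = True) -> List[Event]:
    """Return the events a history query would expose (illustrative only)."""
    if not stop_at_last_deletion:
        # Constraint dropped: every version (v1..vn) is visible.
        return list(events)
    # Constraint in place: only events after the most recent deletion are visible.
    last_delete = max(
        (i for i, e in enumerate(events) if e.kind == "delete"),
        default=-1,  # never deleted, so the whole history is visible
    )
    return list(events[last_delete + 1:])


# The life cycle described in the thread: v1 and v2 were deleted, v3 is live.
kinds = ["create", "update", "update", "delete",   # v1
         "create", "update", "delete",             # v2
         "create", "update"]                       # v3
log = [Event(serial=i, kind=k) for i, k in enumerate(kinds, start=1)]

current = visible_history(log)
full = visible_history(log, stop_at_last_deletion=False)
print([e.serial for e in current])
print([e.serial for e in full])
```

In this model an object whose last event is a deletion returns an empty history under the constraint, which matches the remark that vn is only available if that version still exists in the 'live' database.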