This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/db-wg@ripe.net/
[db-wg] Proposal to allow non-ASCII characters in "org-name:", "person:" and "role:" attributes
Ángel González Berdasco
angel.gonzalez at incibe.es
Mon Dec 18 03:09:19 CET 2023
It's nice to see you here, Marco.

I don't think it will be possible to change the default without potentially causing problems for clients that don't expect it. Adding that -C option to the server seems appropriate. Still, unless you expect that at some point every client will be sending it, a change of the default from iso-8859-1 to utf-8 will leave old clients decoding the output wrongly, as you can't get every client updated at the same time.

The closest alternative would be a predefined date on which both server and clients change the encoding. But that still requires all clients to have been updated, and the date to have been agreed far in advance (and not postponed later!).

Something that could help would be for the server to mark the encoding, so that a client could recognise that the output is no longer iso-8859-1 but utf-8, for example by including a BOM (although it may be preferable to put it inside a comment rather than at the very beginning; I see benefits both ways). Old-but-not-ancient clients could then autodetect the switch.

Most likely, the server default will not change, though.

Best regards

-------- Original message --------
From: Marco d'Itri via db-wg <db-wg at ripe.net>
Date: 12/18/23 00:44 (GMT+01:00)
To: Piotr Strzyzewski <piotr at internetsailor.net>
Cc: db-wg <db-wg at ripe.net>
Subject: Re: [db-wg] Proposal to allow non-ASCII characters in "org-name:", "person:" and "role:" attributes

On Dec 03, Piotr Strzyzewski via db-wg <db-wg at ripe.net> wrote:

> As the UTF-8 topic was briefly discussed during DB-WG session at RIPE87
> in Rome, I would like to propose moving forward with it. If that means a
> topic for first (?) interim meeting, let it be. Let me know please if
> this works for you. Thanks in advance.

In Rome I talked a bit with Edward about this.
Background: I am the author of the whois client used by all Linux distributions.
I fully agree that switching to UTF-8 is desirable, but we cannot just change the encoding of port 43 without major side effects. Since version 5.5.4 (December 2019), the client assumes that the output of whois.ripe.net is Latin 1 and transcodes it to the system encoding. Receiving unexpected UTF-8 would cause mojibake.

My suggestion is to add a new query "command line" option to specify the desired encoding (limited to either ISO-8859-1 or UTF-8), as supported by other whois servers. -C is the most common choice, but maybe it would be better to use --charset so as not to use up a single-letter option. See https://github.com/rfc1036/whois/blob/next/servers_charset_list .

Then, in a few years, it will be much easier to switch the default from Latin 1 to UTF-8.

--
ciao,
Marco
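To make the mojibake concern and the BOM idea from this thread concrete, here is a minimal Python sketch of how a client might autodetect the encoding. It is only an illustration under stated assumptions: the helper name decode_whois_response is hypothetical, the BOM is assumed to sit at the very start of the response (the thread also considers placing it inside a comment), and this is not how the existing whois client or the RIPE server behaves.

```python
import codecs


def decode_whois_response(raw: bytes) -> str:
    """Decode a whois response (hypothetical helper, not a real client API).

    Assumption: a server that has switched to UTF-8 prefixes its output
    with a UTF-8 BOM. Without that marker, the legacy Latin-1 assumption
    is kept.
    """
    if raw.startswith(codecs.BOM_UTF8):
        # Marker present: strip the BOM and decode the rest as UTF-8.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # No marker: fall back to ISO-8859-1, which never fails to decode.
    return raw.decode("iso-8859-1")


line = "person: Ángel\n"

legacy = line.encode("iso-8859-1")               # today's default output
marked = codecs.BOM_UTF8 + line.encode("utf-8")  # UTF-8 announced by a BOM
unmarked = line.encode("utf-8")                  # UTF-8 sent without warning

assert decode_whois_response(legacy) == line
assert decode_whois_response(marked) == line

# Unmarked UTF-8 read as Latin-1 is the mojibake case: the two bytes of
# "Á" (0xC3 0x81) turn into two separate Latin-1 characters.
assert decode_whois_response(unmarked) != line
print(repr(decode_whois_response(unmarked)))
```

The fallback branch is what an un-updated client effectively does today, which is why an unannounced change of the server default would garble non-ASCII names for it.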