<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body dir="auto">
It's nice to see you here, Marco.
<div dir="auto"><br>
</div>
<div dir="auto">I don't think the default can be changed without potentially causing problems for clients that do not expect it. Adding the -C option to the server seems appropriate. But unless you expect that at some point every client will be sending that option,
changing the default from iso-8859-1 to utf-8 will still make old clients decode the output wrongly, as you can't get every client updated at the same time. The closest alternative would be a predefined epoch at
which both server and clients switch encoding. That still requires all clients to have been updated beforehand, and the date to have been agreed far in advance (and not postponed later!).</div>
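<div dir="auto">To make "old clients will be wrong" concrete, here is a minimal Python sketch of a client that assumes Latin 1, as the one described in the quoted message does. The function name and the sample string are made up for illustration; this is not the code of any real whois client.</div>

```python
def decode_as_old_client(raw: bytes) -> str:
    """Decode server output the way a Latin-1-assuming client would (hypothetical helper)."""
    # ISO-8859-1 maps every byte to a code point, so this call never fails:
    # unexpected UTF-8 input is silently turned into mojibake instead.
    return raw.decode("iso-8859-1")


name = "André"
print(decode_as_old_client(name.encode("iso-8859-1")))  # André
print(decode_as_old_client(name.encode("utf-8")))       # AndrÃ©
```

<div dir="auto">Since the decode never raises an error, an old client has no way to notice the switch by itself.</div>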
<div dir="auto">One thing that could help is for the server to mark the encoding, so that the client can recognise that the output is no longer iso-8859-1 but utf-8, for example by including a BOM (it may be preferable to put that
<i>inside</i> a comment rather than at the very beginning; I see benefits both ways). Old-but-not-ancient clients could then autodetect the switch.</div>
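<div dir="auto">A rough Python sketch of that autodetection on the client side, assuming the marker is a UTF-8 BOM at the very start of the response. The function name is hypothetical, the server does not send such a marker today, and the "inside a comment" variant would need a different check.</div>

```python
import codecs


def decode_whois_output(raw: bytes) -> str:
    """Pick the charset from a leading UTF-8 BOM, else assume Latin 1 (hypothetical helper)."""
    if raw.startswith(codecs.BOM_UTF8):
        # Marker present: drop the three BOM bytes and decode the rest as UTF-8.
        return raw[len(codecs.BOM_UTF8):].decode("utf-8")
    # No marker: keep the legacy assumption.
    return raw.decode("iso-8859-1")


legacy = "% André\n".encode("iso-8859-1")
marked = codecs.BOM_UTF8 + "% André\n".encode("utf-8")
print(decode_whois_output(legacy) == decode_whois_output(marked))  # True
```

<div dir="auto">Clients older than this check would still show the BOM bytes plus mojibake, which is why it only helps the old-but-not-ancient ones.</div>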
<div dir="auto">Most likely, the server default will not change, though.</div>
<div dir="auto"><br>
</div>
<div dir="auto"><br>
</div>
<div dir="auto">Best regards</div>
<div><br>
</div>
<div align="left" dir="auto" style="font-size:100%;color:#000000">
<div>-------- Original message --------</div>
<div>From: Marco d'Itri via db-wg &lt;db-wg@ripe.net&gt; </div>
<div>Date: 12/18/23 00:44 (GMT+01:00) </div>
<div>To: Piotr Strzyzewski &lt;piotr@internetsailor.net&gt; </div>
<div>Cc: db-wg &lt;db-wg@ripe.net&gt; </div>
<div>Subject: Re: [db-wg] Proposal to allow non-ASCII characters in "org-name:", "person:" and "role:" attributes</div>
</div>
<font size="2" dir="auto"><span style="font-size:10pt;">
<div class="PlainText"><br>
On Dec 03, Piotr Strzyzewski via db-wg &lt;db-wg@ripe.net&gt; wrote:<br>
<br>
> As the UTF-8 topic was briefly discussed during DB-WG session at RIPE87<br>
> in Rome, I would like to propose moving forward with it. If that means a<br>
> topic for first (?) interim meeting, let it be. Let me know please if<br>
> this works for you. Thanks in advance.<br>
In Rome I talked a bit with Edward about this.<br>
Background: I am the author of the whois client used by all Linux<br>
distributions.<br>
<br>
I fully agree that switching to UTF-8 is desirable, but we cannot just<br>
change the encoding of port 43 without major side effects.<br>
Since version 5.5.4 (december 2019), the client assumes that the output<br>
of whois.ripe.net is Latin 1 and then transcodes it to the system<br>
encoding.<br>
Receiving unexpected UTF-8 would cause mojibake.<br>
<br>
My suggestion is to add a new query "command line" option to specify the<br>
desired encoding (limiting it to either ISO-8859-1 or UTF-8), as<br>
supported by other whois servers.<br>
-C is the most common choice, but maybe it would be better to use<br>
--charset to not waste a single letter option.<br>
See <a href="https://github.com/rfc1036/whois/blob/next/servers_charset_list">https://github.com/rfc1036/whois/blob/next/servers_charset_list</a> .<br>
<br>
In a few years then it will be much easier to switch the default from<br>
Latin 1 to UTF-8.<br>
<br>
--<br>
ciao,<br>
Marco<br>
</div>
</span></font>
</body>
</html>