This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/db-wg@ripe.net/

[db-wg] Puny code or UTF-8 (or both)?

Tony Finch dot at dotat.at
Mon Jul 13 16:43:35 CEST 2020

Peter Koch via db-wg <db-wg at ripe.net> wrote:
>
> I'm not sure I understand the proposal.

Me too :-)

> "punycode" is primarily IDNA2003 speak

AFAIK IDNA2008 uses punycode in exactly the same way as IDNA2003.
One of the major changes was to get rid of stringprep.

> How would that system deal with conversion failures and/or with
> ambiguities between IDNA2003 and IDNA2008?

My understanding is that we want to support Unicode for lots of fields
in the database, and the suggestion is that it might be easier to jam
punycode into the existing ISO 8859-1 fields.

I think this will be difficult if the database is going to use punycode
for fields that aren't domain names or email addresses, and that don't
have standard encoding rules. In particular I wonder how to handle spaces
and upper/lower case. It might be easier to use base64 than punycode (but
actually I think that's a terrible idea).
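
To make the spaces and case question concrete, here is a minimal sketch that uses Python's built-in `punycode` codec (raw RFC 3492 encoding, with none of the IDNA preparation rules) on a free-text field. The helper names `field_to_ace` and `field_from_ace` and the sample value are made up for illustration; nothing like this has been proposed for the database.

```python
def field_to_ace(text: str) -> str:
    """Encode free text with raw Punycode (RFC 3492); no IDNA rules applied."""
    return text.encode("punycode").decode("ascii")


def field_from_ace(ace: str) -> str:
    """Reverse field_to_ace (hypothetical helper, illustration only)."""
    return ace.encode("ascii").decode("punycode")


original = "Zoë Müller Straße 5"   # made-up sample field value
encoded = field_to_ace(original)

print(encoded)                                     # ASCII-only form
print(field_from_ace(encoded) == original)         # True: round trip is lossless
print(field_to_ace(original.lower()) == encoded)   # False: case is significant
```

Raw Punycode copies ASCII characters, including spaces and capital letters, straight into the output, so any case folding or whitespace normalisation would have to be specified separately for each field. There is also no `xn--` style prefix here, so a reader cannot tell an encoded value from ordinary ASCII text.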

There's also George Michaelson's point that the database should have both
the original form of the field as well as a Latin transliteration if
necessary. And this is necessary regardless of how the original form is
encoded (UTF-8, punycode, whatever).

So I think it might be worth adding support for transcoding to/from
punycode domain names and email addresses without waiting for full UTF-8
support, because that's likely to be useful in the long term. (Maybe
something like the DENIC `-T ace` whois option?) But for other fields I
doubt there is a stop-gap that will be easy and useful and not enormously
regrettable in the future.
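
As a rough sketch of what transcoding domain names and email addresses could look like, the following uses Python's built-in `idna` codec. Note the assumptions: that codec implements IDNA2003 (RFC 3490) rather than IDNA2008, the function names are hypothetical, and this is not how the RIPE Database or the DENIC whois server does it.

```python
def domain_to_ace(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (xn--) form."""
    # Raises UnicodeError if a label cannot be converted.
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(domain: str) -> str:
    """Convert an ASCII (xn--) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


def email_to_ace(address: str) -> str:
    """Convert only the domain part of an email address.

    The local part is left untouched, because Punycode/IDNA is defined
    for domain labels and not for mailbox names.
    """
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError("not an email address: %r" % address)
    return local + "@" + domain_to_ace(domain)


print(domain_to_ace("bücher.example"))             # xn--bcher-kva.example
print(domain_to_unicode("xn--bcher-kva.example"))  # bücher.example
print(email_to_ace("info@bücher.example"))         # info@xn--bcher-kva.example
```

A non-ASCII local part would pass through unchanged, so an address like that would still need UTF-8 support in the field itself.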

Tony.
-- 
f.anthony.n.finch  <dot at dotat.at>  http://dotat.at/
South Fitzroy: Northerly 5 to 7. Moderate or rough, becoming slight or
moderate in northeast. Mainly fair. Good.
