This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/db-wg@ripe.net/
[db-wg] Puny code or UTF-8 (or both)?
Tony Finch
dot at dotat.at
Mon Jul 13 16:43:35 CEST 2020
Peter Koch via db-wg <db-wg at ripe.net> wrote:
>
> I'm not sure I understand the proposal.

Me too :-)

> "punycode" is primarily IDNA2003 speak

AFAIK IDNA2008 uses punycode in exactly the same way as IDNA2003. One of
the major changes was to get rid of stringprep.

> How would that system deal with conversion failures and/or with
> ambiguities between IDNA2003 and IDNA2008?

My understanding is that we want to support Unicode for lots of fields in
the database, and the suggestion is that it might be easier to jam
punycode into the existing ISO 8859-1 fields.

I think this will be difficult if the database is going to use punycode
for fields that aren't domain names or email addresses, and that don't
have standard encoding rules. In particular I wonder how to handle spaces
and upper/lower case. It might be easier to use base64 than punycode (but
actually I think that's a terrible idea).

There's also George Michaelson's point that the database should have both
the original form of the field as well as a latin transliteration if
necessary. And this is necessary regardless of how the original form is
encoded (UTF-8, punycode, whatever).

So I think it might be worth adding support for transcoding to/from
punycode domain names and email addresses without waiting for full UTF-8
support, because that's likely to be useful in the long term. (Maybe
something like the DENIC `-T ace` whois option?)

But for other fields I doubt there is a stop-gap that will be easy and
useful and not enormously regrettable in the future.

Tony.
-- 
f.anthony.n.finch <dot at dotat.at> http://dotat.at/
South Fitzroy: Northerly 5 to 7. Moderate or rough, becoming slight or
moderate in northeast. Mainly fair. Good.
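
As a rough illustration of the transcoding suggested above for domain names and email addresses, here is a minimal Python sketch. It is not how the RIPE database or the DENIC whois server does it. The helper names are made up for this example, and it relies on Python's built-in "idna" codec, which implements IDNA2003 (including nameprep), so its results can differ from IDNA2008 for some characters. Only the domain part of an email address is converted; the local part is left untouched, which is an assumption of this sketch and not a rule from the thread.

```python
# Sketch only: hypothetical helpers, stdlib "idna" codec (IDNA2003 rules).

def domain_to_ace(domain: str) -> str:
    """Convert a Unicode domain name to its ASCII (xn--) form."""
    return domain.encode("idna").decode("ascii")


def domain_to_unicode(domain: str) -> str:
    """Convert an ASCII (xn--) domain name back to Unicode."""
    return domain.encode("ascii").decode("idna")


def email_to_ace(address: str) -> str:
    """Convert only the domain part of an email address to ASCII form."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local}@{domain_to_ace(domain)}"


def email_to_unicode(address: str) -> str:
    """Convert only the domain part of an email address back to Unicode."""
    local, sep, domain = address.rpartition("@")
    if not sep:
        raise ValueError(f"not an email address: {address!r}")
    return f"{local}@{domain_to_unicode(domain)}"


if __name__ == "__main__":
    print(domain_to_ace("bücher.example"))
    print(domain_to_unicode("xn--bcher-kva.example"))
    print(email_to_ace("info@bücher.example"))
    # Nameprep folds case and maps some characters, so the original
    # spelling is not always recoverable from the ASCII form.
    print(domain_to_ace("Bücher.example"))
    print(domain_to_ace("straße.example"))
```

The last two calls show why this approach fits domain names better than free-text fields: under IDNA2003 rules the upper-case "B" is folded to lower case and "ß" is mapped to "ss", so the conversion is lossy, and the "ß" case is one where IDNA2003 and IDNA2008 disagree.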