This archive is retained to ensure existing URLs remain functional. It will not contain any emails sent to this mailing list after July 1, 2024. For all messages, including those sent before and after this date, please visit the new location of the archive at https://mailman.ripe.net/archives/list/db-wg@ripe.net/

[db-wg] UTF8


Piotr Strzyzewski Piotr.Strzyzewski at polsl.pl
Wed May 6 14:07:56 CEST 2015

On Wed, May 06, 2015 at 09:13:28AM +0000, denis walker wrote:

Dear Denis

> Thanks for the clarification. I don't think it makes sense to restrict
> the UTF8 to only character sets defined within the RIPE region. (Not
> sure it is even technically possible.) But if a Chinese person lives
> and works in this region why would they not be able to enter their

This idea came from the fact that if someone lives in this region, they
probably have some documents issued by local authorities.
Of course, in some cases this may not be true. So, good point.

> correct name? Just for argument's sake, changing my name into Chinese
> with Google translate changes the space to a '.'. If that is correct
> then the current syntax check fails.

Well spotted.
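
To illustrate the point, here is a minimal sketch in Python. Both patterns are hypothetical and are not the actual syntax rules of the RIPE Database. It assumes the separator produced by the translation is a middle dot (U+00B7), and the sample name is only an illustration.

```python
import re

# Hypothetical ASCII-only rule (NOT the real RIPE Database syntax):
# two or more words separated by single spaces.
ASCII_NAME = re.compile(r"[A-Za-z][A-Za-z0-9.'-]*( [A-Za-z][A-Za-z0-9.'-]*)+")

# Hypothetical relaxed rule: any Unicode letter may start a word, and a
# middle dot (U+00B7) is accepted as a word separator alongside the space.
UNICODE_NAME = re.compile(r"[^\W\d_][\w.'-]*([ \u00b7][^\W\d_][\w.'-]*)+")

latin = "Denis Walker"
sample = "丹尼斯·沃克"  # illustrative name written with a middle dot

for name in (latin, sample):
    ascii_ok = ASCII_NAME.fullmatch(name) is not None
    unicode_ok = UNICODE_NAME.fullmatch(name) is not None
    print(f"{name}: ascii rule={ascii_ok}, relaxed rule={unicode_ok}")
```

The ASCII-only rule rejects the sample twice over: the characters are not in its alphabet and the separator is not a space. Any relaxed rule would have to decide explicitly which separators are allowed.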

> Also "person:", "role:" and "org-name:" are all defined as 'lookup
> keys'. That means you can enter their values in a query as the query
> string and that will be searched on in the database. The individual

This could introduce some inconveniences when using the CLI.

> 'words' from these attribute values are stored in index tables in the
> database and searched as part of the query to return objects with
> matching values. I believe it is problematic to do string comparison
> in UTF8.

I really doubt that. Have you used Google search recently? ;-)

Being more serious, I believe that most of the countries with their own
alphabets do use internet tools and webpages without translating all
the names, addresses and other things to US-ASCII or Latin1.
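
As a rough sketch of why comparison needs some care but is manageable: the Python below normalizes each word before it is stored or looked up. The function names normalize_key and index_words are hypothetical, and the choice of NFKC plus case folding is an assumption, not a description of how the RIPE Database software indexes values.

```python
import unicodedata


def normalize_key(text):
    """Return a comparison key: NFKC-normalized, then case-folded."""
    return unicodedata.normalize("NFKC", text).casefold()


def index_words(value):
    """Split an attribute value into normalized words for a lookup index."""
    return [normalize_key(word) for word in value.split()]


# The same name typed with a precomposed "ż" (U+017C) and with
# "z" followed by a combining dot above (U+0307).
composed = "Piotr Strzy\u017cewski"
decomposed = "Piotr Strzyz\u0307ewski"

print(composed == decomposed)                            # False
print(index_words(composed) == index_words(decomposed))  # True
```

A byte-wise comparison treats the two spellings as different strings, while the normalized keys match. The same normalization would have to be applied to the query string before the lookup.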

> Also the Full Text Search allows searches on all these attributes as
> well as "address:", "descr:" and "remarks:". Again all the component
> parts of these values are indexed for this search.

> So to allow any attribute in UTF8 only, may require software changes
> and may put restrictions on some of the services the database
> currently provides. If you cannot rely on a search returning the
> correct objects then you cannot allow those searches.

I'm aware that any modification may require software changes.
I hope you are not suggesting that we should abandon improvements
just because they require some work.

> There was a Labs article written some time ago on UTF8:
> https://labs.ripe.net/Members/kranjbar/internationalisation-of-ripe-database

> This article put forward the idea of keeping all existing attributes
> in ASCII (but really meant Latin1) and allowing additional optional
> attributes for name and contact details in local language. I think
> that would be a good first step to provide additional benefits of
> localisation without breaking any of the current functionality. Even
> if it was only an interim step it would allow time to assess any issues
> and monitor the usefulness of these new attributes.

It was back in 2010, during RIPE 61, that I proposed person-idn: and
other similar attributes. Although I understand your point of view, I
believe that the situation has changed over the years.

Best regards,
Piotr

-- 
gucio -> Piotr Strzyżewski
E-mail: Piotr.Strzyzewski at polsl.pl
