Re: langauges, locales, regex, LIKE

From: Dennis Gearon <gearond(at)fireserve(dot)net>
To: Richard Huxton <dev(at)archonet(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: langauges, locales, regex, LIKE
Date: 2004-06-24 16:20:35
Message-ID: 40DAFF53.8080703@fireserve.net
Lists: pgsql-general

Richard Huxton wrote:

> Dennis Gearon wrote:
>
>> If I've read everything right, in order to get:
>>
>> multiple languages on a site
>>
>> with the functionality of ALL of:
>> REGEX
>> LIKE
>> Correctly sorted text
>>
>> A site would have to:
>>
>> create a cluster for every language needed
>> run a separate database instance for every language
>> and have the database instances each have their own port
>> and use 8 bit encoding for that specific language
>
>
> You'd need a separate database, not a separate cluster. Each database
> can then have its own encoding and locale.

If I wanted all the languages to be running concurrently, I can't switch the cluster that the database is connected to on the fly, right? The database stays in the cluster it was started in, right? So, if that's true, then I need separate database instances if I want truly accurate sorting.

>
>> because:
>>
>> Sorting is fixed at cluster/directory creation per single
>> database instance
>
>
> To clarify, a cluster is a group of databases that share user logins and
> can all be accessed via the same server.
>
>> And LIKE only works on C Locale with an eight bit encoding
>> and sorting (MAYBE?) works only on 8 bit encoding
>> when using C Locale.
>
>
> You can sort, and I believe use LIKE on UTF etc. However, index use is a
> different matter.

Yup, there is no facility to declare character sets for indexes.
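
A rough sketch of why index use for LIKE is tied to the sort order: a prefix match such as LIKE 'abc%' can be answered from a sorted structure as a range scan, but only if the structure is ordered byte by byte (as in the C locale). The Python below is an illustration under that assumption, not PostgreSQL's implementation; prefix_range and the sample names are hypothetical.

```python
import bisect

def prefix_range(sorted_values, prefix):
    """Return all values starting with prefix, using two binary searches.

    Assumes sorted_values is sorted in plain code point order (Python's
    default string ordering) and that prefix is non-empty.
    """
    lo = bisect.bisect_left(sorted_values, prefix)
    # Upper bound: the prefix with its last character bumped by one code
    # point, which sorts after every string that starts with the prefix.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    hi = bisect.bisect_left(sorted_values, upper)
    return sorted_values[lo:hi]

# Stand-in for an index: a list kept in code point order.
names = sorted(["abd", "abc", "abcde", "ab", "abz", "b"])
print(prefix_range(names, "abc"))
```

If the list were ordered by a language-aware collation instead, strings sharing a prefix would not be guaranteed to sit between those two bounds, so the same shortcut could not be relied on.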

>
>> If anyone can correct me on this, I'd love to hear it.
>>
>> Boy, the old LOCALE system has really got to go someday.
>
>
> The issue isn't so much the difficulty of supporting multiple locales
> (AFAIK). I believe it's more to do with interactions. If you have a
> table containing multiple languages in the same column, what does it
> mean to sort that table? Do you sort by language-name then by languages?
> If you don't, what rules do you follow?
>
> What happens if we compare different languages?
> Does fr/fr:"a" == en/gb:"a"?
> Does en/gb:"hello" == en/us:"hello"?
>
> Messy, isn't it?
>
Without language-specific characters, they will sort exactly the same.
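
A small sketch of that point, assuming simple per-character collation rules. The two key functions are toy stand-ins for a German-style and a Swedish-style ordering (hypothetical names, hand-written tables), not what any OS locale or PostgreSQL actually ships. Real locales can also have multi-character rules that this ignores.

```python
# Toy collation keys; the mapping tables are illustrative assumptions only.

def german_key(word):
    # German-style: umlauts sort together with their base letter.
    base = {"ä": "a", "ö": "o", "ü": "u"}
    return "".join(base.get(ch, ch) for ch in word.lower())

def swedish_key(word):
    # Swedish-style: å, ä, ö come after z, in that order.
    # "{", "|", "}" are the ASCII characters directly above "z".
    tail = {"å": "{", "ä": "|", "ö": "}"}
    return "".join(tail.get(ch, ch) for ch in word.lower())

plain = ["zebra", "apple", "mango"]      # no language-specific characters
accented = ["zebra", "äpfel", "apple"]   # one word starts with ä

# Same result under both toy orderings.
print(sorted(plain, key=german_key) == sorted(plain, key=swedish_key))

# Different results once a language-specific character appears.
print(sorted(accented, key=german_key))
print(sorted(accented, key=swedish_key))
```

With only unaccented letters both keys reduce to the lowercased word, so the orderings agree; the word starting with ä lands first under the German-style key and last under the Swedish-style one.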
