Re: languages, locales, regex, LIKE

From: Richard Huxton <dev(at)archonet(dot)com>
To: Dennis Gearon <gearond(at)fireserve(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: languages, locales, regex, LIKE
Date: 2004-06-24 07:11:44
Message-ID: 40DA7EB0.1030300@archonet.com
Lists: pgsql-general

Dennis Gearon wrote:
> If I've read everything right, in order to get:
>
> multiple languages on a site
>
> with the functionality of ALL of:
>
> REGEX
> LIKE
> Correctly sorted text
>
> A site would have to:
>
> create a cluster for every language needed
> run a separate database instance for every language
> and have the database instances each have their own port
> and use 8 bit encoding for that specific language

You'd need a separate database, not a separate cluster. Each database
can then have its own encoding and locale.

> because:
>
> Sorting is fixed at cluster/directory creation per single
> database instance

To clarify, a cluster is a group of databases that share user logins and
can all be accessed via the same server.

> And LIKE only works on C Locale with an eight bit encoding
> and sorting (MAYBE?) works only on 8 bit encoding
> when using C Locale.

You can sort, and I believe use LIKE, on UTF-8 etc. Whether LIKE can
use an index is a different matter.

> If anyone can correct me on this, I'd love to hear it.
>
> Boy, the old LOCALE system has really got to go someday.

The issue isn't so much the difficulty of supporting multiple locales
(AFAIK). I believe it's more to do with interactions. If you have a
table containing multiple languages in the same column, what does it
mean to sort that table? Do you sort by language name and then by the
text within each language? If you don't, what rules do you follow?
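
To make the choice concrete, here is a minimal Python sketch of two
possible answers. It is not how PostgreSQL behaves; the rows and both key
functions are made up for illustration, and plain code-point comparison
stands in for a real locale-aware collation.

```python
# Hypothetical rows: (language tag, text) pairs sharing one column.
# "\u00e9t\u00e9" is the French word for "summer", written with escapes
# so the precomposed e-acute (U+00E9) is unambiguous.
rows = [
    ("fr", "\u00e9t\u00e9"),
    ("en", "zebra"),
    ("fr", "abricot"),
    ("en", "apple"),
]


def by_language_then_text(row):
    """Option 1: group rows by language tag, then order by text."""
    lang, text = row
    return (lang, text)


def by_text_only(row):
    """Option 2: ignore the language tag and compare raw code points."""
    return row[1]


grouped = sorted(rows, key=by_language_then_text)
mixed = sorted(rows, key=by_text_only)

print([text for _, text in grouped])
print([text for _, text in mixed])
```

In the second ordering the French word lands after "zebra", because
U+00E9 has a higher code point than "z". Neither ordering is obviously
the right one, which is the point.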

What happens if we compare different languages?
Does fr/fr:"a" == en/gb:"a"?
Does en/gb:"hello" == en/us:"hello"?
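
A rough Python sketch of the two extremes, assuming each string carries
its language tag with it. The TaggedText class and the same_text helper
are hypothetical names for illustration, not anything PostgreSQL provides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedText:
    """A string that remembers which language/locale it belongs to."""
    lang: str  # e.g. "fr/fr", "en/gb"
    text: str


def same_text(a, b):
    """Loose comparison: ignore the language tag entirely."""
    return a.text == b.text


fr_a = TaggedText("fr/fr", "a")
gb_a = TaggedText("en/gb", "a")
gb_hello = TaggedText("en/gb", "hello")
us_hello = TaggedText("en/us", "hello")

# Strict: the dataclass-generated == compares both lang and text.
strict = (fr_a == gb_a, gb_hello == us_hello)

# Loose: only the characters matter.
loose = (same_text(fr_a, gb_a), same_text(gb_hello, us_hello))

print(strict, loose)
```

The strict rule says neither pair is equal; the loose rule says both
are. Any rule in between has to decide which languages count as close
enough.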

Messy, isn't it?

--
Richard Huxton
Archonet Ltd
