Re: [HACKERS] Sigh, LIKE indexing is *still* broken in foreign locales

From: Giles Lean <giles(at)nemeton(dot)com(dot)au>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Moucha Václav <MouchaV(at)radiomobil(dot)cz>, pgsql-bugs(at)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Sigh, LIKE indexing is *still* broken in foreign locales
Date: 2000-06-08 06:41:59
Message-ID: 2958.960446519@nemeton.com.au
Lists: pgsql-bugs pgsql-hackers


On Wed, 07 Jun 2000 22:22:06 -0400 Tom Lane wrote:

> Since '\341' and '\342' are two different accented forms of 'a'
> (if I'm looking at the right character set), this is perhaps not so
> improbable as all that. Evidently the collation rule is that different
> accent forms sort the same unless the strings would otherwise be
> considered equal, in which case an ordering is assigned to them.

I thought that was common, but although I've sometimes worked on
internationalisation issues, I'm no linguist.

> So, the rule we thought we had for generating index bounds falls flat,
> and we're back to the same old question: given a proposed prefix string,
> how can we generate bounds that are certain to be considered <= and >=
> all strings starting with that prefix?

To confess ignorance, why does PostgreSQL need to generate such
bounds? Complete string comparisons with a locale-aware function such
as strcoll() are safe. Using less than a full string is tricky
indeed, and I'm not sure it is possible in general, although it might be.
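
As a rough illustration of the distinction (a minimal sketch, not
PostgreSQL's actual code; all three helper names are hypothetical):
whole-string ordering can be left entirely to strcoll(), a prefix
test can be done bytewise without consulting the collation at all,
and a proposed pair of index bounds is only usable if every string
with the prefix passes the third check under the current locale.

```c
#include <assert.h>
#include <locale.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: order two complete strings by the current
 * LC_COLLATE rules. Returns <0, 0 or >0, like strcmp(). */
static int full_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return strcoll(a, b);
}

/* Hypothetical helper: bytewise prefix test. It never consults the
 * collation order, so locale-specific sorting cannot affect it. */
static bool has_prefix(const char *s, const char *prefix)
{
    assert(s != NULL && prefix != NULL);
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Hypothetical helper: does s fall inside [lo, hi] under the current
 * collation? A range scan is only correct if this holds for every
 * string that has the prefix. */
static bool within_bounds(const char *s, const char *lo, const char *hi)
{
    return full_compare(lo, s) <= 0 && full_compare(s, hi) <= 0;
}
```

In the "C" locale, where strcoll() orders strings as strcmp() does,
within_bounds("abcdef", "abc", "abd") holds. The trouble described in
this thread is that in other locales a string can satisfy has_prefix()
and still fail within_bounds() for the bounds one would naively pick.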

Other problematic cases are likely to include one-to-two collations
(ß in German, for example) and two-to-one collations (the reverse, but
I've forgotten my example. Anyone?)

Then there are wide characters, including some encodings that are
stateful.
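
To sketch why that matters for prefixes (an illustration only; the
helper name count_chars is hypothetical): in a multibyte encoding,
character boundaries can only be found by decoding from the start
while carrying a conversion state, so cutting a string at an arbitrary
byte offset to form a prefix may split a character or lose shift
state. The standard mbrtowc() does the decoding here.

```c
#include <assert.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the characters in a multibyte string
 * under the current LC_CTYPE. Returns (size_t)-1 if the string holds
 * an invalid or truncated multibyte sequence. */
static size_t count_chars(const char *s)
{
    mbstate_t st;
    size_t len, n = 0;

    assert(s != NULL);
    len = strlen(s);
    memset(&st, 0, sizeof st);          /* start in the initial shift state */

    while (len > 0) {
        /* Decode one character; st carries any shift state forward. */
        size_t r = mbrtowc(NULL, s, len, &st);

        if (r == (size_t)-1 || r == (size_t)-2)
            return (size_t)-1;          /* invalid or incomplete sequence */
        if (r == 0)
            break;                      /* embedded terminator */
        s += r;
        len -= r;
        n++;
    }
    return n;
}
```

For plain ASCII input the byte count and character count agree, so
count_chars("abc") is 3. In a multibyte locale the two can differ, and
only offsets that this kind of loop stops at are safe places to cut.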

Regards,

Giles
