RE: Progress report on locale safe LIKE indexing

From: Peter Eisentraut <peter_e(at)gmx(dot)net>
To: Hiroshi Inoue <Inoue(at)tpf(dot)co(dot)jp>
Cc: PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: Progress report on locale safe LIKE indexing
Date: 2001-08-18 22:56:53
Message-ID: Pine.LNX.4.30.0108190040400.677-100000@peter.localdomain
Lists: pgsql-hackers

Hiroshi Inoue writes:

> Please look at my first question.
> This depends on the assumption that '=' is equivalent in
> any locale. Is it guaranteed ?
> For example, ( 'a' = 'A' ) isn't allowed in any locale ?.
>
> And your answer was
> The whole point here is not to rely on '='.
>
> Clearly your theory depends on the assumption that
> If a = b in some locale then a = b in ASCII locale.
>
> And where does 'a' <> 'A' come from ?
> The definition of '=' is a part of collating sequence.
>
> >
> > > LIKE seems to use the collating sequence.
> >
> > No. The collating sequence defines the order of all possible strings.
> > LIKE doesn't order anything.
>
> Again where does it come from ?

Let me elaborate again:

We want to be able to use btree indexes for LIKE expressions, under the
theory that given the expression col LIKE 'foo%' we can augment it with the
condition col >= 'foo' AND col < 'fop', which a btree can handle. Our
problem is that this theory was false, because if the operators >= and <
are locale-aware they can do just about anything. So my solution was to
implement an extra set of operators >= and < (named $>=$ and $<$ for the
heck of it) that are *not* locale-aware, so that this whole thing works
again.
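
To make the range idea concrete, here is a minimal sketch in Python (not
the actual backend code). It assumes comparison by raw character code,
which is what Python's built-in string comparison does, and it uses a
hypothetical helper prefix_range() that derives the upper bound by
incrementing the last character of the prefix. The case where that last
character is already the highest possible code is not handled.

```python
def prefix_range(prefix):
    """Return (lower, upper) bounds for the pattern prefix + '%'.

    Hypothetical helper for illustration only. Assumes a non-empty
    prefix whose last character is not the highest possible code point.
    """
    if not prefix:
        raise ValueError("prefix must not be empty")
    # 'foo' -> 'fop': bump the character code of the last character.
    upper = prefix[:-1] + chr(ord(prefix[-1]) + 1)
    return prefix, upper


def in_range(value, lower, upper):
    # Python compares str values by code point, i.e. without any locale,
    # which plays the role of the non-locale-aware $>=$ and $<$ here.
    return lower <= value < upper


words = ["fo", "fon", "foo", "foobar", "fooz", "fop", "Foo"]
lower, upper = prefix_range("foo")

by_range = [w for w in words if in_range(w, lower, upper)]
by_prefix = [w for w in words if w.startswith("foo")]
```

With character-code comparison, by_range and by_prefix select the same
values, which is exactly the property the index optimization needs.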

Now, if you look at the code that does the LIKE pattern matching you'll
see that it does not use any locale features; it simply compares
characters for equality based on their character codes, accounting for the
wildcards. Consequently, this whole operation has nothing to do with
locales. It was an error to involve locales in the first place; that's why
we had all these problems.
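
As an illustration of matching purely on character codes, here is a small
Python sketch of a LIKE-style matcher. It is not the backend's
implementation; the function name like_match() is made up for this
example, and escape characters are deliberately left out. The only
comparison it ever performs between data and pattern is equality of
individual characters.

```python
def like_match(text, pattern):
    """Match text against a LIKE-style pattern.

    '%' matches any run of characters (including none), '_' matches
    exactly one character, everything else must be equal by character
    code. Illustrative sketch: no escape handling, no locale use.
    """
    t = p = 0
    star_p = -1   # position of the most recent '%' in the pattern
    star_t = 0    # text position where that '%' started matching
    while t < len(text):
        if p < len(pattern) and pattern[p] == "%":
            # Remember the wildcard and first try matching zero characters.
            star_p, star_t = p, t
            p += 1
        elif p < len(pattern) and (pattern[p] == "_" or pattern[p] == text[t]):
            # Single-character wildcard or plain equality of codes.
            p += 1
            t += 1
        elif star_p != -1:
            # Mismatch: let the last '%' absorb one more character.
            star_t += 1
            t = star_t
            p = star_p + 1
        else:
            return False
    # Any trailing '%' in the pattern can match the empty string.
    while p < len(pattern) and pattern[p] == "%":
        p += 1
    return p == len(pattern)
```

Note that nothing in this routine orders two strings; it only tests
whether characters are identical, so no collating sequence enters into it.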

--
Peter Eisentraut peter_e(at)gmx(dot)net http://funkturm.homeip.net/~peter
