SV: Bug in index scans with Locale support enabled

From: "Jarmo Paavilainen" <netletter(at)comder(dot)com>
To: "PostgreSQL General" <pgsql-general(at)postgresql(dot)org>
Subject: SV: Bug in index scans with Locale support enabled
Date: 2000-12-09 10:28:35
Message-ID: 00d701c061ca$c93d4de0$1501a8c0@theboss.comder.private
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general pgsql-hackers

Hi,

...
> In researching a problem I have uncovered the following bug in index
> scans when Locale support is enabled.
...
> environment variable is set to en_US) to enable the US english locale
...
> create table test (test_col text);
> create index test_index on test (test_col);
> insert into test values ('abc.xyz');
> insert into test values ('abcxyz');
> insert into test values ('abc/xyz');
>
> If you run the query:
>
> select * from test where test_col >= 'abc.';
>
> One would normally expect to only get one record returned, but instead
> all records are returned.

I would expect all to be returned (maybe not "abc/..."). Because noice
should be sorted first. ie. '.' is less than '0' and 'x' (and maybe '/').

...
> The reason for this is that in the en_US locale all non-alphanumeric
> characters are ignored when doing string comparisons. So the data above

...or... *I think* they are sorted first. If that is correct in your locale,
I do not know.

...
> Note that if you use a different locale for example en_UK, you will get

Thats odd, I would expect en_UK and en_US to sort the same way (same
charset).

...
> select * from text where test_col like 'abc/%';
>
> This query should return one row, the row for 'abc/xyz'. However if the
> above query is executed via an index scan it will return the wrong
> number of rows (0 in this case).

ehh index scan? test_col >= 'abc/' or test_col >= 'abc/%' ????
The first one should return all rows but the one with '.', while the second
should return 0 rows. If the first one returns zero rows, then its a bug.

If you meant what the optimizer does with LIKE, well *I think* such
optimazion is asking for trouble (compare strings with anything else than =
and != are, well hard to predict).

...
> "like '/aaa/bbb/%' don't work. From the above information I have
> determined that such queries don't work if:
> a) database is built with Locale support enabled (--enable-locale)

Actually they should not work without '--enable-locale', or then Im wrong.

> b) the database is running with locale en_US
> c) the column the like is being performed on is indexed

Dangerous LIKE optimation.

...
> The current implementation for converting the like into an index scan
> doesn't work with Locale support enabled and the en_US locale as shown

Hmm. If memory serves its dropped in the later builds (no like optimation).

// Jarmo

In response to

Browse pgsql-general by date

  From Date Subject
Next Message Mayers, Philip J 2000-12-09 12:20:47 RE: Help interpreting the output of EXPLAIN
Previous Message Tatsuo Ishii 2000-12-09 07:47:21 Re: Japan pictures

Browse pgsql-hackers by date

  From Date Subject
Next Message Oleg Bartunov 2000-12-09 12:40:53
Previous Message Oleg Bartunov 2000-12-09 08:50:17 Re: OK, does anyone have any better ideas?