Re: langauges, locales, regex, LIKE

From: Dennis Gearon <gearond(at)fireserve(dot)net>
To: John Sidney-Woollett <johnsw(at)wardbrook(dot)com>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: langauges, locales, regex, LIKE
Date: 2004-06-24 16:23:05
Message-ID: 40DAFFE9.7080400@fireserve.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-general

John Sidney-Woollett wrote:

> For what it's worth, we have a unicode 7.4.1 database which gives us the
> sorting and searching behaviour that we expect (with the exception of
> the upper and lower functions). We access the data via jdbc so we don't
> have to deal with encoding issues per se as the driver does any
> translation for us.
>
> Currently we don't use any LIKE statements, but if we did, and wanted
> them optimized then we'd use the appropriate OP Class when defining the
> index. We also don't use any REGEX expressions. And we'll shortly be
> experimenting with tsearch2...
>
> List of databases
> Name | Owner | Encoding
> ---------------+----------+----------
> test | postgres | UNICODE
>
> Setting the psql client encoding to Latin1 and inserting the following
> data...
>
> # select * from johntest;
> id | value
> ----+-------
> 1 | test
> 2 | tést
> 3 | tèst
> 4 | taste
> 5 | TEST
> 6 | TÉST
> 7 | TÈST
> 8 | TASTE
> (8 rows)
>
> and then extracting the data in sorted order works as we would expect
>
> # select * from johntest order by value (no index on the value field)
> id | value
> ----+-------
> 8 | TASTE
> 5 | TEST
> 7 | TÈST
> 6 | TÉST
> 4 | taste
> 1 | test
> 3 | tèst
> 2 | tést
> (8 rows)
>
> however, applying the UPPER function to the data does not work as
> expected, problem with ids 6,7,3,2 - should be ordered (3,7 or 7,3) ,
> (6,2 or 2,6)
>
> # select * from johntest order by upper(value);
> id | value
> ----+-------
> 4 | taste
> 8 | TASTE
> 1 | test
> 5 | TEST
> 7 | TÈST
> 6 | TÉST
> 3 | tèst
> 2 | tést
> (8 rows)
>
> using a LIKE operation also works as expected (again no index on value
> field)
>
> # select * from johntest where value like 't%';
> id | value
> ----+-------
> 1 | test
> 2 | tést
> 3 | tèst
> 4 | taste
> (4 rows)
>
Like works, but it can't use an index, and so would have horibble performance vs. the situation where it CAN use an index. I believe this is how Postgres is working now.

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message John Sidney-Woollett 2004-06-24 16:29:10 Re: langauges, locales, regex, LIKE
Previous Message Dennis Gearon 2004-06-24 16:20:35 Re: langauges, locales, regex, LIKE