From: | Dennis Gearon <gearond(at)fireserve(dot)net> |
---|---|
To: | John Sidney-Woollett <johnsw(at)wardbrook(dot)com> |
Cc: | pgsql-general(at)postgresql(dot)org |
Subject: | Re: langauges, locales, regex, LIKE |
Date: | 2004-06-24 16:23:05 |
Message-ID: | 40DAFFE9.7080400@fireserve.net |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-general |
John Sidney-Woollett wrote:
> For what it's worth, we have a unicode 7.4.1 database which gives us the
> sorting and searching behaviour that we expect (with the exception of
> the upper and lower functions). We access the data via jdbc so we don't
> have to deal with encoding issues per se as the driver does any
> translation for us.
>
> Currently we don't use any LIKE statements, but if we did, and wanted
> them optimized then we'd use the appropriate OP Class when defining the
> index. We also don't use any REGEX expressions. And we'll shortly be
> experimenting with tsearch2...
>
> List of databases
> Name | Owner | Encoding
> ---------------+----------+----------
> test | postgres | UNICODE
>
> Setting the psql client encoding to Latin1 and inserting the following
> data...
>
> # select * from johntest;
> id | value
> ----+-------
> 1 | test
> 2 | tést
> 3 | tèst
> 4 | taste
> 5 | TEST
> 6 | TÉST
> 7 | TÈST
> 8 | TASTE
> (8 rows)
>
> and then extracting the data in sorted order works as we would expect
>
> # select * from johntest order by value (no index on the value field)
> id | value
> ----+-------
> 8 | TASTE
> 5 | TEST
> 7 | TÈST
> 6 | TÉST
> 4 | taste
> 1 | test
> 3 | tèst
> 2 | tést
> (8 rows)
>
> however, applying the UPPER function to the data does not work as
> expected, problem with ids 6,7,3,2 - should be ordered (3,7 or 7,3) ,
> (6,2 or 2,6)
>
> # select * from johntest order by upper(value);
> id | value
> ----+-------
> 4 | taste
> 8 | TASTE
> 1 | test
> 5 | TEST
> 7 | TÈST
> 6 | TÉST
> 3 | tèst
> 2 | tést
> (8 rows)
>
> using a LIKE operation also works as expected (again no index on value
> field)
>
> # select * from johntest where value like 't%';
> id | value
> ----+-------
> 1 | test
> 2 | tést
> 3 | tèst
> 4 | taste
> (4 rows)
>
Like works, but it can't use an index, and so would have horibble performance vs. the situation where it CAN use an index. I believe this is how Postgres is working now.
From | Date | Subject | |
---|---|---|---|
Next Message | John Sidney-Woollett | 2004-06-24 16:29:10 | Re: langauges, locales, regex, LIKE |
Previous Message | Dennis Gearon | 2004-06-24 16:20:35 | Re: langauges, locales, regex, LIKE |