UTF-8 ilike case insensitive search

From: Martin Edlman <edlman(at)fortech(dot)cz>
To: PostgreSQL bugs <pgsql-bugs(at)postgresql(dot)org>
Subject: UTF-8 ilike case insensitive search
Date: 2005-07-15 11:16:55
Message-ID: 42D79B27.7000107@fortech.cz
Lists: pgsql-bugs

Hello,

I have a problem with the ilike search operator and Czech characters. It
no longer matches case-insensitively since I moved from PostgreSQL 7.3
with ISO-8859-2 encoding to PostgreSQL 8.0 with UTF-8 encoding.

I dumped the old database, converted the dump from ISO-8859-2 to UTF-8
using iconv, and imported it into the new database.

I have a table "customer" with column "surname" and there is a record
with surname "Červenka" (first letter is "C" with caron). When I execute

select * from customer where surname ilike '%čer%'; -- "c" with caron

I get an empty result. After changing the pattern to start with a capital
letter, I get the expected row:

select * from customer where surname ilike '%Čer%'; -- "C" with caron

I thought it was a locale-related problem, but the upper and lower
functions work correctly:

cust=# select upper('čer'), lower('ČER');
 upper | lower
-------+-------
 ČER   | čer
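
Because upper and lower handle these characters correctly, one possible
workaround is to fold both sides explicitly and use plain like instead of
ilike. This is only a sketch: it assumes lower() keeps behaving as in the
output above, and it has not been confirmed on this installation.

```sql
-- Workaround sketch (assumption: lower() is locale-aware here, as shown above).
-- Fold the column and the pattern to lower case, then match with plain LIKE.
select * from customer where lower(surname) like lower('%čer%');
```

Note that a leading '%' in the pattern generally prevents index use, so
this form is expected to scan the table just as the ilike query does.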

When I use an 'order by' clause, the data is sorted correctly according
to Czech collation rules, so that part works.

The system locale is cs_CZ.UTF-8, the database was initialized by initdb
with the same locale, and postgresql.conf contains

lc_messages = 'cs_CZ.UTF-8'
lc_monetary = 'cs_CZ.UTF-8'
lc_numeric = 'cs_CZ.UTF-8'
lc_time = 'cs_CZ.UTF-8'

pg_controldata returns

[postgres(at)acc ~]$ pg_controldata /var/lib/pgsql/data
pg_control version number: 74
LC_COLLATE (string comparison): cs_CZ.UTF-8
LC_CTYPE (character types): cs_CZ.UTF-8

PostgreSQL is running in an environment with LANG set to cs_CZ.UTF-8:

$ ps auxef | grep postmaster
/usr/bin/postmaster -p 5432 -D /var/lib/pgsql/data USER=postgres
MAIL=/var/spool/mail/postgres
PATH=/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/usr/X11R6/bin
INPUTRC=/etc/inputrc PWD=/var/lib/pgsql LANG=cs_CZ.UTF-8 SHLVL=1
HOME=/var/lib/pgsql LOGNAME=postgres PGDATA=/var/lib/pgsql/data

Is it a bug in PostgreSQL or am I missing something?

--
Regards,

Martin Edlman
Fortech s.r.o, Litomysl
Public PGP key: http://edas.visaci.cz/#keys
