Re: Operator "=" not unicode-safe?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Jörg Haustein <Joerg(dot)Haustein(at)urz(dot)uni-heidelberg(dot)de>
Cc: pgsql-bugs(at)postgresql(dot)org
Subject: Re: Operator "=" not unicode-safe?
Date: 2005-08-19 18:40:20
Message-ID: 26649.1124476820@sss.pgh.pa.us
Lists: pgsql-bugs

Jörg Haustein <Joerg(dot)Haustein(at)urz(dot)uni-heidelberg(dot)de> writes:
> I have a UNICODE database, trying to compare two unicode strings (Ethiopic
> characters). Client encoding is also UNICODE:
> ===================================================
> testdb=> select ' '=' ';
> ?column?
> ----------
> t
> (1 row)

> Clearly, it can be seen that they are not equal.

Sounds to me like you chose a locale that is expecting some non-Unicode
encoding. "=" ultimately depends on the system's strcoll() routine,
and in many locales strcoll doesn't behave very sanely when handed data
that's illegal in whatever it thinks the encoding is.

Redo your initdb in a locale that is UTF-8 based, and make sure to keep
the database encoding UTF8. The apparent flexibility to choose
different database encodings really only works if the underlying locale
is "C".

There is a warning about this in the docs, though perhaps not prominent
enough:
http://www.postgresql.org/docs/8.0/static/multibyte.html#AEN20633

regards, tom lane
