Operator "=" not unicode-safe?

From: Jörg Haustein <Joerg(dot)Haustein(at)urz(dot)uni-heidelberg(dot)de>
To: pgsql-bugs(at)postgresql(dot)org
Subject: Operator "=" not unicode-safe?
Date: 2005-08-19 17:44:15
Message-ID: 43061A6F.7020204@urz.uni-heidelberg.de
Lists: pgsql-bugs

PostgreSQL version: 8.0.3
Operating system: Linux (SuSE 9.1)

I have a UNICODE database and am trying to compare two Unicode strings
(Ethiopic characters). The client encoding is also UNICODE:
===================================================
testdb=> select 'በድሩ ሁሴን'='ሰይፉ ከበደ';
?column?
----------
t
(1 row)

The two strings are clearly not equal, and the "LIKE" operator agrees:

testdb=> select 'በድሩ ሁሴን' LIKE 'ሰይፉ ከበደ';
?column?
----------
f
(1 row)
===================================================

What is the problem here?
The behavior is the same with SQL_ASCII databases and the SQL_ASCII client
encoding.
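
As a cross-check outside the database, here is a minimal Python sketch (not
PostgreSQL code) that compares the same two literals by their UTF-8 bytes.
The helper name same_bytes is made up for illustration, and the sketch
assumes the literals are pasted exactly as in the queries above. It only
shows what a byte-wise comparison yields; it does not show how the "="
operator is implemented.

```python
# Compare the two literals from the report by their UTF-8 encoding.
a = "በድሩ ሁሴን"
b = "ሰይፉ ከበደ"


def same_bytes(x: str, y: str) -> bool:
    """Return True only if both strings encode to identical UTF-8 bytes."""
    return x.encode("utf-8") == y.encode("utf-8")


if __name__ == "__main__":
    # Print the raw bytes in hex so the difference is visible.
    print(a.encode("utf-8").hex())
    print(b.encode("utf-8").hex())
    print("equal byte-wise:", same_bytes(a, b))
```

Byte for byte the two strings differ, which matches the result of LIKE and
not the result of "=".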

Of course one could always overload the operator or just use LIKE. But
where it really matters is in queries that use UNION, EXCEPT or INTERSECT:

==========================

testdb=> select a from a;
a
---------
በድሩ ሁሴን
ሰይፉ ከበደ
(2 rows)

testdb=> select a from b;
a
---------
ሰይፉ ከበደ
(1 row)

testdb=> select a from a union select a from b;
a
---------
በድሩ ሁሴን
(1 row)

testdb=> select a from a except select a from b;
a
---
(0 rows)

testdb=> select a from a intersect select a from b;
a
---------
በድሩ ሁሴን
(1 row)
==========================
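
For comparison, the results I would expect can be sketched in Python (again
not PostgreSQL code). The function set_ops and the lists table_a and table_b
are hypothetical stand-ins for tables a and b, and the sketch assumes that
two rows are duplicates only when their UTF-8 bytes are identical.

```python
def set_ops(rows_a, rows_b):
    """Emulate UNION, EXCEPT and INTERSECT using byte-wise row equality."""
    keys_a = {r.encode("utf-8") for r in rows_a}
    keys_b = {r.encode("utf-8") for r in rows_b}

    def decode(keys):
        # Turn the byte keys back into sorted lists of strings.
        return sorted(k.decode("utf-8") for k in keys)

    return {
        "union": decode(keys_a | keys_b),
        "except": decode(keys_a - keys_b),
        "intersect": decode(keys_a & keys_b),
    }


# Stand-ins for the contents of tables a and b shown above.
table_a = ["በድሩ ሁሴን", "ሰይፉ ከበደ"]
table_b = ["ሰይፉ ከበደ"]

if __name__ == "__main__":
    for name, rows in set_ops(table_a, table_b).items():
        print(name, rows)
```

Under that assumption UNION should return both rows, EXCEPT should return
only the first row of a, and INTERSECT should return the row that is in b,
none of which matches the output above.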

What can I do?

With kind regards,

Jörg Haustein
