Re: UTF-8 and LIKE vs =

From: David Wheeler <david(at)kineticode(dot)com>
To: Markus Bertheau <twanger(at)bluetwanger(dot)de>
Cc: Ian Barwick <barwick(at)gmail(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: UTF-8 and LIKE vs =
Date: 2004-08-23 22:58:33
Message-ID: F5FA9A60-F557-11D8-990D-000A95972D84@kineticode.com
Lists: pgsql-general

On Aug 23, 2004, at 3:46 PM, Markus Bertheau wrote:

> The collation rules of your (and my) locale say that these strings are
> the same:
>
> [markus(at)teetnang markus]$ cat > t
> 국방비
> 북한의
> [markus(at)teetnang markus]$ uniq t
> 국방비
> [markus(at)teetnang markus]$

Interesting.
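
For comparison, here is a minimal sketch of the same check with the
collation forced to "C". It assumes a POSIX shell and a uniq that honors
LC_ALL (GNU uniq does); the variable name distinct_lines is only
illustrative.

```shell
# Same two lines as in the quoted example, but compared bytewise:
# LC_ALL=C overrides the locale's collation rules for this one command.
# tr strips the padding that some wc implementations add to the count.
distinct_lines=$(printf '%s\n' '국방비' '북한의' | LC_ALL=C uniq | wc -l | tr -d ' ')

# Number of lines that survived uniq under bytewise comparison.
echo "$distinct_lines"
```

Both lines should survive here, since their UTF-8 byte sequences differ.
Without LC_ALL=C the result depends on the locale data installed on the
machine, which is presumably why the quoted run collapsed them into one.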

> Make sure that you have initdb'd the database under the right locale.
> There's not much PostgreSQL can do if strcoll() says that the strings
> are equal.

Well, I have data from a number of different locales in the same
database, so no single locale's collation rules would suit all of it.
I'm hoping that setting the locale to "C", which compares strings byte
by byte, will do the trick. It seems to work properly on my Mac:

sharky=# select * from keyword where name = '국방비';
 id |  name  | screen_name | sort_name | active
----+--------+-------------+-----------+--------
  0 | 국방비 | 국방비      | 국방비    |      1
(1 row)

sharky=# select * from keyword where name = '북한의';
 id | name | screen_name | sort_name | active
----+------+-------------+-----------+--------
(0 rows)

sharky=# select * from keyword where name like '북한의';
 id | name | screen_name | sort_name | active
----+------+-------------+-----------+--------
(0 rows)

sharky=# select * from keyword where lower(name) like '국방비';
 id |  name  | screen_name | sort_name | active
----+--------+-------------+-----------+--------
  0 | 국방비 | 국방비      | 국방비    |      1
(1 row)

Regards,

David
