Re: UTF-8 and LIKE vs =

From:	David Wheeler <david(at)kineticode(dot)com>
To:	Ian Barwick <barwick(at)gmail(dot)com>
Cc:	pgsql-general(at)postgresql(dot)org
Subject:	Re: UTF-8 and LIKE vs =
Date:	2004-08-23 21:04:05
Message-ID:	F820D962-F547-11D8-990D-000A95972D84@kineticode.com
Lists:	pgsql-general

On Aug 23, 2004, at 1:58 PM, Ian Barwick wrote:

> er, the characters in "name" don't seem to match the characters in the
> query - '국방비' vs. '북한의' - does that have any bearing?

Yes, it means that = is doing the wrong thing!!

I noticed this because I had a query that was looking in the keyword
table for an existing record using LIKE. If it didn't find it, it
inserted it. But the inserts were giving me an error because the name
column has a UNIQUE index on it. Could it be that the index and the =
operator are comparing bytes, and that '국방비' and '북한의' have the same
bytes but different characters??
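
One quick way to test the "same bytes" hypothesis outside the database is to encode both strings as UTF-8 and compare the results. The following is only a minimal sketch: the helper names `utf8_bytes` and `same_bytes` are made up for illustration, and it assumes the two strings are exactly as quoted in this thread. It says nothing about how the server's = operator or the UNIQUE index actually compare values.

```python
# Sketch: do the two strings from the thread share a UTF-8 byte sequence?
# Helper names are hypothetical; only the two literals come from the thread.

def utf8_bytes(text: str) -> bytes:
    """Return the UTF-8 encoding of text."""
    return text.encode("utf-8")


def same_bytes(a: str, b: str) -> bool:
    """True only if both strings encode to the identical UTF-8 byte sequence."""
    return utf8_bytes(a) == utf8_bytes(b)


name = "국방비"   # value stored in the "name" column, per the thread
query = "북한의"  # value used in the query, per the thread

for label, text in (("name", name), ("query", query)):
    code_points = " ".join(f"U+{ord(ch):04X}" for ch in text)
    print(label, code_points, utf8_bytes(text).hex())

# Different code points always give different UTF-8 bytes.
print(same_bytes(name, query))
```

Because the two strings are made of different code points, the final comparison prints False. If = still treats them as equal, a plain byte-for-byte comparison is probably not what is happening, and the comparison rules in effect on the server (for example, the locale the database was initialized with) may be worth checking.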

If so, this is a pretty serious problem. How can I get = and the
indices to use character semantics rather than byte semantics? I also
need to be able to store data in different languages in the database
(and in the same column!), but all in Unicode.

TIA,

David

In response to

Re: UTF-8 and LIKE vs = at 2004-08-23 20:58:45 from Ian Barwick

Responses

Re: UTF-8 and LIKE vs = at 2004-08-23 21:25:05 from Ian Barwick
Re: UTF-8 and LIKE vs = at 2004-08-23 21:46:40 from David Wheeler
Re: UTF-8 and LIKE vs = at 2004-08-23 22:44:47 from Tom Lane
Re: UTF-8 and LIKE vs = at 2004-08-23 22:46:50 from Markus Bertheau
Re: UTF-8 and LIKE vs = at 2004-08-23 23:34:46 from Ian Barwick
