Re: UTF-8 and LIKE vs =

From: Ian Barwick <barwick(at)gmail(dot)com>
To: Markus Bertheau <twanger(at)bluetwanger(dot)de>
Cc: David Wheeler <david(at)kineticode(dot)com>, pgsql-general(at)postgresql(dot)org
Subject: Re: UTF-8 and LIKE vs =
Date: 2004-08-23 23:34:46
Message-ID: 1d581afe040823163456af8598@mail.gmail.com
Lists: pgsql-general

On Tue, 24 Aug 2004 00:46:50 +0200, Markus Bertheau
<twanger(at)bluetwanger(dot)de> wrote:
>
>
> On Mon, 23.08.2004, at 23:04, David Wheeler wrote:
> > On Aug 23, 2004, at 1:58 PM, Ian Barwick wrote:
> >
> > > er, the characters in "name" don't seem to match the characters in the
> > > query - '국방비' vs. '북한의' - does that have any bearing?
> >
> > Yes, it means that = is doing the wrong thing!!
>
> The collation rules of your (and my) locale say that these strings are
> the same:
>
> [markus(at)teetnang markus]$ cat > t
> 국방비
> 북한의
> [markus(at)teetnang markus]$ uniq t
> 국방비
> [markus(at)teetnang markus]$

Wild speculation in need of a Korean speaker, but:

ian(at)linux:~/tmp> cat j.txt
テスト
환경설
전검색
웹문서
국방비
북한의
てすと
ian(at)linux:~/tmp> uniq j.txt
テスト
환경설
てすと

All lines except the first and last are random strings of Korean
(Hangul) characters. Evidently our respective locales treat all Hangul
strings of the same length as identical, which they almost certainly
are not...
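
To make the speculation above concrete, here is a minimal Python sketch
of how a uniq-style tool would behave if the locale's collation gave
every Hangul syllable the same weight. This is only a model of the
symptom: broken_key() is a hypothetical stand-in, not the real locale
code, and I am assuming (not claiming) that the actual collation tables
behave this way. uniq() here just mimics uniq(1) by dropping adjacent
lines that compare equal under the given key.

```python
from itertools import groupby


def broken_key(s):
    # HYPOTHETICAL collation key (an assumption, not the real locale data):
    # every precomposed Hangul syllable (U+AC00..U+D7A3) gets the same
    # weight, so only the number of syllables distinguishes two strings.
    return tuple(0 if 0xAC00 <= ord(ch) <= 0xD7A3 else ord(ch) for ch in s)


def uniq(lines, key=lambda s: s):
    # Like uniq(1): keep the first line of each run of adjacent lines
    # that compare equal under `key`.
    return [next(group) for _, group in groupby(lines, key=key)]


lines = ["テスト", "환경설", "전검색", "웹문서", "국방비", "북한의", "てすと"]

# Plain code point comparison: no two adjacent lines are equal.
print(uniq(lines))

# Modelled broken collation: the five Hangul lines collapse into one.
print(uniq(lines, key=broken_key))
```

Under this assumed key the second call keeps only the first Hangul
line, which matches the uniq output shown above; with the default key
all seven lines survive.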

Ian Barwick
