Re: UTF-8 and LIKE vs =

From: Joel <rees(at)ddcom(dot)co(dot)jp>
To: pgsql-general(at)postgresql(dot)org
Cc: David Wheeler <david(at)kineticode(dot)com>
Subject: Re: UTF-8 and LIKE vs =
Date: 2004-08-24 01:58:59
Message-ID: 20040824105027.1B8D.REES@ddcom.co.jp
Lists: pgsql-general

On Tue, 24 Aug 2004 01:34:46 +0200
Ian Barwick <barwick(at)gmail(dot)com> wrote:

> ...
> wild speculation in need of a Korean speaker, but:
>
> ian(at)linux:~/tmp> cat j.txt
> 繝せ繝
> 俾イス、
> イlラ
> ケク
> ュゥ
> 復
> 縺ヲ縺吶→
> ian(at)linux:~/tmp> uniq j.txt
> 繝せ繝
> 俾イス、
> 縺ヲ縺吶→
>
> All but the first and last lines are random Korean (Hangul)
> characters. Evidently our respective locales think all Hangul strings
> of the same length are identical, which is very probably not the
> case...

My browser just nicely botched replying on those, but looking at Ian's
post, the first and last lines looked like "test" written in Japanese,
the first line in katakana and the last line in hiragana.

The following should end up posted as Shift_JIS. Either way,

テスト
and
てすと

should collate the same in some contexts, since the difference between
katakana and hiragana is more or less equivalent to a difference in case.
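
To make the case analogy concrete, here is a minimal sketch of a kana-insensitive comparison. It assumes only the Unicode layout, in which katakana U+30A1..U+30F6 sit exactly 0x60 above the matching hiragana. The names fold_kana and kana_insensitive_equal are made up for illustration; this is not how PostgreSQL or any system locale implements collation.

```python
# Distance between a katakana code point and its hiragana counterpart.
KANA_OFFSET = 0x60


def fold_kana(text: str) -> str:
    """Map katakana U+30A1..U+30F6 to hiragana; leave everything else alone."""
    return "".join(
        chr(ord(ch) - KANA_OFFSET) if "\u30a1" <= ch <= "\u30f6" else ch
        for ch in text
    )


def kana_insensitive_equal(a: str, b: str) -> bool:
    """Compare two strings while ignoring the katakana/hiragana distinction."""
    return fold_kana(a) == fold_kana(b)


print("テスト" == "てすと")                         # plain code point comparison
print(kana_insensitive_equal("テスト", "てすと"))   # comparison after folding
```

The plain comparison treats the two spellings as different strings, while the folded comparison treats them as equal, much as lower() does for Latin letters. A real collation would also need to handle half-width katakana and the long-vowel mark, which this sketch ignores.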

--
Joel <rees(at)ddcom(dot)co(dot)jp>
