Re: Patch for collation using ICU

From:	Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>
To:	john(at)geeknet(dot)com(dot)au
Cc:	pgman(at)candle(dot)pha(dot)pa(dot)us, girgen(at)pingpong(dot)net, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Patch for collation using ICU
Date:	2005-05-08 13:08:27
Message-ID:	20050508.220827.104049106.t-ishii@sra.co.jp
Lists:	pgsql-hackers

> > I don't buy it. If the current conversion tables do the right
> > thing, why do we need to replace them? Or if the conversion tables
> > are not correct, why don't you fix them? I think the rules of
> > character conversion will not change frequently, especially
> > for LATIN languages. Thus the maintenance cost is not too high.
>
> I never said we need to, but if we're going to implement ICU,
> then we might as well go all the way.

So you admit there's no benefit in using ICU to replace the existing
conversions?

Besides the fact that ICU does not support all existing conversions,
I think ICU has a serious flaw when used for conversion. If I
understand correctly, ICU uses UNICODE internally to do the
conversion. For example, to implement the SJIS->EUC_JP conversion,
ICU first converts SJIS to UNICODE and then converts UNICODE to
EUC_JP. The problem is that these conversions are not round-trip safe
(conversion between SJIS/EUC_JP and UNICODE will lose some
information). Thus an SJIS->EUC_JP->SJIS conversion using ICU does
not preserve the original text.
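
To illustrate the point, here is a minimal sketch of how a conversion
routed through a pivot encoding can fail to round trip. The tables
below are hypothetical toy mappings, not the real SJIS, EUC_JP, or
UNICODE tables, and the function names are made up for this example.
The only assumption is that two distinct source codes map to the same
pivot character, which is the kind of information loss described
above.

```python
# Hypothetical toy tables, NOT real SJIS/EUC_JP/UNICODE mappings.
# Two distinct source codes collapse onto the same pivot character.
SRC_TO_PIVOT = {
    b"\x01": "A",
    b"\x02": "B",
    b"\x03": "B",  # duplicate: same pivot character as b"\x02"
}
PIVOT_TO_DST = {"A": b"\x81", "B": b"\x82"}


def invert(table):
    """Build the reverse table; on duplicate values the first key wins."""
    reverse = {}
    for key, value in table.items():
        reverse.setdefault(value, key)
    return reverse


DST_TO_PIVOT = invert(PIVOT_TO_DST)
PIVOT_TO_SRC = invert(SRC_TO_PIVOT)


def convert(codes, to_pivot, from_pivot):
    """Convert each code by going through the pivot character."""
    return [from_pivot[to_pivot[code]] for code in codes]


def round_trip(codes):
    """Source -> pivot -> destination -> pivot -> source."""
    dst = convert(codes, SRC_TO_PIVOT, PIVOT_TO_DST)
    return convert(dst, DST_TO_PIVOT, PIVOT_TO_SRC)


if __name__ == "__main__":
    for original in ([b"\x01", b"\x02"], [b"\x03"]):
        result = round_trip(original)
        status = "preserved" if result == original else "changed"
        print(original, "->", result, status)
```

In this sketch the code b"\x03" comes back as b"\x02", because once
both codes have been mapped to the same pivot character there is no
way to tell which one the text originally contained. A direct
source-to-destination table would not have to go through that
many-to-one step.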
--
Tatsuo Ishii

In response to

Re: Patch for collation using ICU at 2005-05-08 03:59:05 from John Hansen
