Re: Patch: add conversion from pg_wchar to multibyte

From:	Tatsuo Ishii <ishii(at)postgresql(dot)org>
To:	robertmhaas(at)gmail(dot)com
Cc:	ishii(at)postgresql(dot)org, aekorotkov(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Patch: add conversion from pg_wchar to multibyte
Date:	2012-07-03 06:17:47
Message-ID:	20120703.151747.1330940307954703732.t-ishii@sraoss.co.jp
Lists:	pgsql-hackers

> OK. So, in that case, I suggest that if the leading byte is non-zero,
> we emit 0x9d followed by the three available bytes, instead of first
> testing whether the first byte is >= 0xf0. That test seems to serve
> no purpose but to confuse the issue.

Probably the code should look like this (see the comment below):

    else if (lb >= 0xf0 && lb <= 0xfe)
    {
        /* LCPRV2: 0xf0-0xf4 take prefix 0x9c, 0xf5-0xfe take 0x9d */
        if (lb <= 0xf4)
            *to++ = 0x9c;
        else
            *to++ = 0x9d;
        *to++ = lb;
        *to++ = (*from >> 8) & 0xff;
        *to++ = *from & 0xff;
        cnt += 4;
    }

> I further suggest that we improve the comments on the mule functions
> for both wchar->mb and mb->wchar to make all this more clear.

I have added comments about the mule internal encoding, working from
memory and from an old document found on the web
(http://mibai.tec.u-ryukyu.ac.jp/cgi-bin/info2www?%28mule%29Buffer%20and%20string).

Please take a look. BTW, it seems the conversion between multibyte and
wchar can round-trip in the case where the leading character is LCPRV2:

If the second byte of the wchar (a wchar has 4 bytes, and the first
byte is always 0x00) is in the range 0xf0 to 0xf4, then the first byte
of the multibyte sequence must be 0x9c. If the second byte of the wchar
is in the range 0xf5 to 0xfe, then the first byte of the multibyte
sequence must be 0x9d.
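
To illustrate the round trip described above, here is a minimal
standalone sketch. It is not the code in the attached patch: the
function names (lcprv2_prefix, wchar_to_lcprv2, lcprv2_to_wchar) are
hypothetical, and it assumes the wchar is laid out as 0x00, charset
byte, and two data bytes, with the multibyte form being the 0x9c/0x9d
prefix followed by the same three bytes.

```c
#include <stdint.h>

/* Hypothetical helper: choose the LCPRV2 prefix from the charset byte.
 * Returns 0 when the byte is outside the 0xf0-0xfe range. */
static unsigned char
lcprv2_prefix(unsigned char lb)
{
    if (lb >= 0xf0 && lb <= 0xf4)
        return 0x9c;
    if (lb >= 0xf5 && lb <= 0xfe)
        return 0x9d;
    return 0;
}

/* wchar (0x00, lb, c1, c2) -> 4-byte multibyte sequence.
 * Returns the number of bytes written, or 0 if wc is not an LCPRV2 value. */
static int
wchar_to_lcprv2(uint32_t wc, unsigned char *to)
{
    unsigned char lb = (wc >> 16) & 0xff;
    unsigned char prefix = lcprv2_prefix(lb);

    if ((wc >> 24) != 0 || prefix == 0)
        return 0;
    to[0] = prefix;
    to[1] = lb;
    to[2] = (wc >> 8) & 0xff;
    to[3] = wc & 0xff;
    return 4;
}

/* 4-byte multibyte sequence -> wchar. The prefix byte is dropped, which
 * is safe because it can be recomputed from the charset byte. */
static uint32_t
lcprv2_to_wchar(const unsigned char *from)
{
    return ((uint32_t) from[1] << 16) |
           ((uint32_t) from[2] << 8) |
           (uint32_t) from[3];
}
```

Because the prefix is a pure function of the charset byte, dropping it
on the way to wchar loses no information, which is why the conversion
can round-trip in this case.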
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

Attachment	Content-Type	Size
pg_wchar.h.patch	text/x-patch	1.7 KB

In response to

Re: Patch: add conversion from pg_wchar to multibyte at 2012-07-03 02:56:18 from Robert Haas

Responses

Re: Patch: add conversion from pg_wchar to multibyte at 2012-07-03 21:41:11 from Alexander Korotkov
Re: Patch: add conversion from pg_wchar to multibyte at 2012-07-03 22:05:14 from Tatsuo Ishii
