Re: Patch: add conversion from pg_wchar to multibyte

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Tatsuo Ishii <ishii(at)postgresql(dot)org>
Cc: robertmhaas(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add conversion from pg_wchar to multibyte
Date: 2012-07-03 21:41:11
Message-ID: CAPpHfdssF4epQsghxDyyw_=8=tscHaXCU-wx14EoQwTsipvrEw@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 3, 2012 at 10:17 AM, Tatsuo Ishii <ishii(at)postgresql(dot)org> wrote:

> > OK. So, in that case, I suggest that if the leading byte is non-zero,
> > we emit 0x9d followed by the three available bytes, instead of first
> > testing whether the first byte is >= 0xf0. That test seems to serve
> > no purpose but to confuse the issue.
>
> Probably the code should look like this (see comment below):
>
> else if (lb >= 0xf0 && lb <= 0xfe)
> {
>     if (lb <= 0xf4)
>         *to++ = 0x9c;
>     else
>         *to++ = 0x9d;
>     *to++ = lb;
>     *to++ = (*from >> 8) & 0xff;
>     *to++ = *from & 0xff;
>     cnt += 4;
> }

We probably also need to give names to all these numbers
(0xf0, 0xf4, 0xfe, 0x9c, 0x9d), but it's hard for me to invent such names.
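
Purely as an illustration, one way the numbers could be named is sketched
below. Every identifier here is hypothetical and is not taken from the patch
or from any existing header; only the values and the 0xf4 split point come
from the quoted snippet.

```c
#include <stdint.h>

/* Hypothetical names; the values are the ones used in the quoted snippet. */
enum
{
    PRV2_LB_MIN    = 0xf0,  /* lowest leading byte accepted by the test */
    PRV2_LB_SPLIT  = 0xf4,  /* last leading byte that gets prefix 0x9c */
    PRV2_LB_MAX    = 0xfe,  /* highest leading byte accepted by the test */
    PRV2_PREFIX_LO = 0x9c,  /* prefix emitted for 0xf0..0xf4 */
    PRV2_PREFIX_HI = 0x9d   /* prefix emitted for 0xf5..0xfe */
};

/*
 * Return the prefix byte for leading byte lb, or 0 when lb is outside
 * the 0xf0..0xfe range handled by the snippet.
 */
static uint8_t
prv2_prefix_for(uint8_t lb)
{
    if (lb < PRV2_LB_MIN || lb > PRV2_LB_MAX)
        return 0;
    return (lb <= PRV2_LB_SPLIT) ? PRV2_PREFIX_LO : PRV2_PREFIX_HI;
}
```

With names like these, the branch in the snippet would reduce to a single
call that picks the prefix before the three data bytes are written.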

> > I further suggest that we improve the comments on the mule functions
> > for both wchar->mb and mb->wchar to make all this more clear.
>
> I have added comments about mule internal encoding by refreshing my
> memory and from an old document found on the web:
> http://mibai.tec.u-ryukyu.ac.jp/cgi-bin/info2www?%28mule%29Buffer%20and%20string
>
> Please take a look. BTW, it seems conversion between multibyte and
> wchar can round-trip when the leading character is LCPRV2:
>
> If the second byte of wchar (out of the 4 bytes of wchar; the first
> byte is always 0x00) is in the range 0xf0 to 0xf4, then the first byte
> of multibyte must be 0x9c. If the second byte of wchar is in the range
> 0xf5 to 0xfe, then the first byte of multibyte must be 0x9d.
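
To make the round-trip claim concrete, here is a minimal sketch of the rule
as stated above. It is not the code from the patch: both function names are
hypothetical, and the decoder assumes that the mb->wchar direction simply
drops the prefix byte and packs the remaining three bytes.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical encoder: write the 4-byte multibyte form of wc into "to".
 * Returns 4 on success, or 0 if the first byte of wc is not 0x00 or the
 * second byte is outside 0xf0..0xfe.
 */
static size_t
sketch_wchar_to_mb(uint32_t wc, uint8_t *to)
{
    uint8_t lb = (wc >> 16) & 0xff;

    if ((wc >> 24) != 0 || lb < 0xf0 || lb > 0xfe)
        return 0;
    to[0] = (lb <= 0xf4) ? 0x9c : 0x9d;  /* prefix is implied by lb */
    to[1] = lb;
    to[2] = (wc >> 8) & 0xff;
    to[3] = wc & 0xff;
    return 4;
}

/*
 * Hypothetical decoder (assumed behaviour): skip the prefix byte and
 * pack the three bytes that follow it into the low 24 bits.
 */
static uint32_t
sketch_mb_to_wchar(const uint8_t *from)
{
    return ((uint32_t) from[1] << 16) | ((uint32_t) from[2] << 8) | from[3];
}
```

Because the prefix is fully determined by the second byte of the wchar,
decoding and then re-encoding gives back the same four bytes in this sketch.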

Should I integrate these code changes into my patch? Or would we like to
commit them first?

------
With best regards,
Alexander Korotkov.
