Re: Patch: add conversion from pg_wchar to multibyte

From: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tatsuo Ishii <ishii(at)postgresql(dot)org>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Patch: add conversion from pg_wchar to multibyte
Date: 2012-07-02 20:46:03
Message-ID: CAPpHfdvjejw0d5XyHoLXhvBpNiYiK_YbTN9395KGRjOMpqANPg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Jul 3, 2012 at 12:37 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:

> On Mon, Jul 2, 2012 at 4:33 PM, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> > On Mon, Jul 2, 2012 at 8:12 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> >>
> >> On Sun, Jul 1, 2012 at 5:11 AM, Alexander Korotkov <aekorotkov(at)gmail(dot)com> wrote:
> >> >> MULE also looks problematic. The code that you've written isn't
> >> >> symmetric with the opposite conversion, unlike what you did in all
> >> >> other cases, and I don't understand why. I'm also somewhat baffled by
> >> >> the reverse conversion: it treats a multi-byte sequence beginning with
> >> >> a byte for which IS_LCPRV1(x) returns true as invalid if there are
> >> >> less than 3 bytes available, but it only reads two; similarly, for
> >> >> IS_LCPRV2(x), it demands 4 bytes but converts only 3.
> >> >
> >> > Should we save existing pg_wchar representation for MULE encoding?
> >> > Probably, we can modify it like in 0.1 version of patch in order to
> >> > make it more transparent.
> >>
> >> Changing the encoding would break pg_upgrade, so -1 from me on that.
> >
> >
> > I didn't realize that we store pg_wchar on disk somewhere. I thought it is
> > only in-memory representation. Where do we store pg_wchar on disk?
>
> OK, now I'm confused. I was thinking (incorrectly) that you were
> talking about changing the multibyte encoding, which of course is
> saved on disk all over the place. Changing the wchar encoding is a
> different kettle of fish, and I have no idea what that would or would
> not break. But I don't see why we'd want to do such a thing. We just
> need to make the MB->WCHAR and WCHAR->MB transformations mirror images
> of each other; why is that hard?

So, I provided such a transformation in versions 0.3 and 0.4 of the patch,
based on the explanation from Tatsuo Ishii. The problem is that both
conversions are nontrivial, and it's not evident that they mirror each other
(seeing that they do requires some additional assumptions about the
encodings, which are not evident from the transformations themselves). I
thought you mentioned that problem two messages back.
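
To make the "mirror image" property concrete, here is a minimal sketch of a
round-trip check on a toy encoding. This is an illustration only: the layout
(one-byte ASCII, plus a two-byte form with a lead byte in 0x81..0x8d and a
trail byte >= 0xa0) and all names (toy_wchar, toy_mb2wchar, toy_wchar2mb,
toy_roundtrip_ok) are hypothetical and are not the MULE code from the patch.
It only shows the kind of property one would want to hold: converting
MB->WCHAR and then WCHAR->MB gives back the original bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical toy encoding (NOT the real MULE layout):
 *   0x00..0x7f              one byte,  wide value = the byte itself
 *   lead 0x81..0x8d + trail two bytes, wide value = (lead << 16) | trail,
 *                           where trail >= 0xa0
 */
typedef uint32_t toy_wchar;

static int toy_is_lead(unsigned char c)
{
    return c >= 0x81 && c <= 0x8d;
}

/* MB -> WCHAR. Returns the number of wide chars written, or -1 if invalid. */
static int toy_mb2wchar(const unsigned char *from, size_t len, toy_wchar *to)
{
    int n = 0;

    while (len > 0)
    {
        if (*from < 0x80)
        {
            to[n++] = *from++;
            len -= 1;
        }
        else if (toy_is_lead(*from) && len >= 2 && from[1] >= 0xa0)
        {
            /* demand exactly as many bytes as are consumed */
            to[n++] = ((toy_wchar) from[0] << 16) | from[1];
            from += 2;
            len -= 2;
        }
        else
            return -1;
    }
    return n;
}

/* WCHAR -> MB. Returns the number of bytes written, or -1 if invalid. */
static int toy_wchar2mb(const toy_wchar *from, size_t n, unsigned char *to)
{
    int out = 0;

    for (size_t i = 0; i < n; i++)
    {
        toy_wchar w = from[i];
        unsigned char lead = (w >> 16) & 0xff;
        unsigned char trail = w & 0xff;

        if (w < 0x80)
            to[out++] = (unsigned char) w;
        else if ((w & 0xff00ff00u) == 0 && toy_is_lead(lead) && trail >= 0xa0)
        {
            to[out++] = lead;
            to[out++] = trail;
        }
        else
            return -1;
    }
    return out;
}

/* Returns 1 if mb survives MB -> WCHAR -> MB unchanged, 0 otherwise. */
static int toy_roundtrip_ok(const unsigned char *mb, size_t len)
{
    toy_wchar wide[64];
    unsigned char back[128];
    int n, m;

    assert(len <= 64);          /* keeps both buffers large enough */
    n = toy_mb2wchar(mb, len, wide);
    if (n < 0)
        return 0;
    m = toy_wchar2mb(wide, (size_t) n, back);
    return m == (int) len && memcmp(mb, back, len) == 0;
}
```

In this toy case the symmetry is visible by inspection, because each branch
of one function has an obvious counterpart in the other. For MULE that is
exactly what is not evident, so a check of this shape over representative
byte sequences would be one way to gain confidence.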

------
With best regards,
Alexander Korotkov.
