Patch: add conversion from pg_wchar to multibyte

From:	Alexander Korotkov <aekorotkov(at)gmail(dot)com>
To:	pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Patch: add conversion from pg_wchar to multibyte
Date:	2012-04-23 08:48:20
Message-ID:	CAPpHfdshcHe1ZPQhyd2xhAKnNu0VpdMPuGFtvribqJcnH0K2Ew@mail.gmail.com
Lists:	pgsql-hackers

Hackers,

The attached patch adds conversion from a pg_wchar string to a multibyte
string. This functionality is needed for my patch on index support for
regular expression search:
http://archives.postgresql.org/pgsql-hackers/2011-11/msg01297.php
Analyzing the conversion from multibyte to pg_wchar, I found the following
types of conversion:
1) Trivial conversion for single-byte encodings. It just adds leading zeros
to each byte.
2) Conversion from UTF-8 to Unicode.
3) Conversions from euc* encodings. They write the bytes of a character to
pg_wchar in inverse order, starting from the lower byte (this explanation
assumes a little-endian system).
4) Conversion from the mule encoding. This conversion is unclear to me and
also seems to be lossy.
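
To make types 1-3 concrete, here is a minimal sketch of the forward direction (multibyte to wide character). It is not the PostgreSQL source and not the code in the attached patch: the type `wchar_sketch` and the functions `single_byte_to_wchar`, `utf8_to_codepoint` and `pack_bytes` are hypothetical names chosen for illustration, a 32-bit wide character is assumed, and input is assumed to be valid.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pg_wchar; a 32-bit unsigned integer is assumed. */
typedef uint32_t wchar_sketch;

/* Type 1: single-byte encoding. Each byte is zero-extended to a wide char. */
size_t
single_byte_to_wchar(const unsigned char *from, size_t len, wchar_sketch *to)
{
    size_t i;

    assert(from != NULL && to != NULL);
    for (i = 0; i < len; i++)
        to[i] = from[i];
    return len;
}

/*
 * Type 2: decode one UTF-8 sequence (1 to 4 bytes) into a Unicode code point.
 * Assumes well-formed input. Returns the number of bytes consumed.
 */
size_t
utf8_to_codepoint(const unsigned char *from, wchar_sketch *cp)
{
    assert(from != NULL && cp != NULL);
    if ((from[0] & 0x80) == 0)
    {
        *cp = from[0];
        return 1;
    }
    if ((from[0] & 0xE0) == 0xC0)
    {
        *cp = ((from[0] & 0x1F) << 6) | (from[1] & 0x3F);
        return 2;
    }
    if ((from[0] & 0xF0) == 0xE0)
    {
        *cp = ((from[0] & 0x0F) << 12) | ((from[1] & 0x3F) << 6) |
              (from[2] & 0x3F);
        return 3;
    }
    *cp = ((from[0] & 0x07) << 18) | ((from[1] & 0x3F) << 12) |
          ((from[2] & 0x3F) << 6) | (from[3] & 0x3F);
    return 4;
}

/*
 * Type 3 (simplified): pack the bytes of one multibyte character into a
 * single integer, first byte most significant. On a little-endian machine
 * the last byte of the character therefore sits at the lowest address.
 */
wchar_sketch
pack_bytes(const unsigned char *from, size_t nbytes)
{
    wchar_sketch w = 0;
    size_t i;

    assert(from != NULL && nbytes <= 4);
    for (i = 0; i < nbytes; i++)
        w = (w << 8) | from[i];
    return w;
}
```

The real euc* routines also handle single-shift prefix bytes; the sketch leaves that out and only shows the byte-packing idea.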

It was easy to write the inverse conversion for 1-3. I've changed conversion
4 to behave like 3. I'm not sure my change is OK, because I didn't understand
the original conversion.
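
As an illustration of why the inverse is straightforward for types 1-3, the following is a rough sketch of the reverse direction (wide character to multibyte). It is not taken from the attached patch: `wchar_sketch`, `wchar_to_single_byte`, `codepoint_to_utf8` and `unpack_bytes` are hypothetical names, and a 32-bit wide character holding valid values is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for pg_wchar; a 32-bit unsigned integer is assumed. */
typedef uint32_t wchar_sketch;

/* Inverse of type 1: drop the leading zeros and keep the low byte. */
size_t
wchar_to_single_byte(const wchar_sketch *from, size_t len, unsigned char *to)
{
    size_t i;

    assert(from != NULL && to != NULL);
    for (i = 0; i < len; i++)
        to[i] = (unsigned char) (from[i] & 0xFF);
    return len;
}

/*
 * Inverse of type 2: encode one Unicode code point as UTF-8.
 * Returns the number of bytes written (1 to 4).
 */
size_t
codepoint_to_utf8(wchar_sketch cp, unsigned char *to)
{
    assert(to != NULL);
    if (cp <= 0x7F)
    {
        to[0] = (unsigned char) cp;
        return 1;
    }
    if (cp <= 0x7FF)
    {
        to[0] = (unsigned char) (0xC0 | (cp >> 6));
        to[1] = (unsigned char) (0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp <= 0xFFFF)
    {
        to[0] = (unsigned char) (0xE0 | (cp >> 12));
        to[1] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
        to[2] = (unsigned char) (0x80 | (cp & 0x3F));
        return 3;
    }
    to[0] = (unsigned char) (0xF0 | (cp >> 18));
    to[1] = (unsigned char) (0x80 | ((cp >> 12) & 0x3F));
    to[2] = (unsigned char) (0x80 | ((cp >> 6) & 0x3F));
    to[3] = (unsigned char) (0x80 | (cp & 0x3F));
    return 4;
}

/*
 * Inverse of type 3 (simplified): emit the packed bytes from most to least
 * significant, skipping leading zero bytes. This relies on every byte of a
 * multibyte character being non-zero, which holds when all bytes have the
 * high bit set.
 */
size_t
unpack_bytes(wchar_sketch w, unsigned char *to)
{
    size_t n = 0;
    int shift;

    assert(to != NULL);
    for (shift = 24; shift >= 0; shift -= 8)
    {
        unsigned char b = (unsigned char) ((w >> shift) & 0xFF);

        /* Always emit the lowest byte so that w == 0 yields one byte. */
        if (b != 0 || n > 0 || shift == 0)
            to[n++] = b;
    }
    return n;
}
```

A lossy forward conversion, which is what type 4 appears to be, has no such exact inverse, because several inputs can map to the same wide character.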

------
With best regards,
Alexander Korotkov.

Attachment	Content-Type	Size
wchar2mb-0.1.patch	application/octet-stream	15.6 KB

Responses

Re: Patch: add conversion from pg_wchar to multibyte at 2012-05-21 22:37:54 from Alexander Korotkov
