Re: A thought about regex versus multibyte character sets

From:	Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: A thought about regex versus multibyte character sets
Date:	2009-12-01 03:13:10
Message-ID:	17821.1259637190@sss.pgh.pa.us
Lists:	pgsql-hackers

I wrote:
> I therefore propose the following idea: if the database encoding is
> UTF8, allow the regc_locale.c functions to call the <wctype.h>
> functions, assuming that wchar_t and pg_wchar_t share the same
> representation. On platforms where wchar_t is only 16 bits, we can do
> this up to U+FFFF and be stupid about code points above that.

Or to be concrete, how about the attached? It seems to do what's
wanted, but I'm hardly the best-qualified person to test it.
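
The quoted proposal can be sketched roughly as follows. This is only an
illustration of the idea, not the contents of utf8-regex-1.patch. The names
sketch_wchar, sketch_max_wctype_codepoint, and sketch_isalpha are
hypothetical; sketch_wchar stands in for the pg_wchar_t type mentioned above
and is assumed to hold a plain Unicode code point when the encoding is UTF8.
The ASCII-only fallback is likewise an assumption made for the sketch.

```c
#include <stdbool.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-in for the regex engine's wide-character type. */
typedef unsigned int sketch_wchar;

/*
 * Largest code point we are willing to pass to the <wctype.h> functions.
 * Where wchar_t is only 16 bits, stop at U+FFFF and treat anything above
 * that conservatively.
 */
static sketch_wchar
sketch_max_wctype_codepoint(void)
{
    return (sizeof(wchar_t) >= 4) ? 0x10FFFF : 0xFFFF;
}

/*
 * Classify a character as alphabetic.  With a UTF8 database encoding,
 * defer to iswalpha() for code points that fit in wchar_t; otherwise
 * fall back to a plain ASCII test (assumed fallback, for illustration).
 */
static bool
sketch_isalpha(sketch_wchar c, bool encoding_is_utf8)
{
    if (encoding_is_utf8 && c <= sketch_max_wctype_codepoint())
        return iswalpha((wint_t) c) != 0;

    return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
}
```

Note that what iswalpha() reports for non-ASCII code points depends on the
process's LC_CTYPE setting, so the same test would be needed for each of the
other character classes, and results can differ between platforms.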

regards, tom lane

Attachment	Content-Type	Size
utf8-regex-1.patch	text/x-patch	6.6 KB

In response to

A thought about regex versus multibyte character sets at 2009-11-30 18:15:06 from Tom Lane