Re: A thought about regex versus multibyte character sets

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: A thought about regex versus multibyte character sets
Date: 2009-12-01 03:13:10
Lists: pgsql-hackers
I wrote:
> I therefore propose the following idea: if the database encoding is
> UTF8, allow the regc_locale.c functions to call the <wctype.h>
> functions, assuming that wchar_t and pg_wchar_t share the same
> representation.  On platforms where wchar_t is only 16 bits, we can do
> this up to U+FFFF and be stupid about code points above that.

Or to be concrete, how about the attached?  It seems to do what's
wanted, but I'm hardly the best-qualified person to test it.
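
As a rough illustration of the idea quoted above (this is not the attached
patch), a character-class test might look something like the sketch below.
The names sketch_wchar, sketch_isalpha, and db_is_utf8 are hypothetical
stand-ins, and the non-UTF8 fallback to <ctype.h> for single-byte values is
an assumption made only for this example.

```c
#include <ctype.h>
#include <limits.h>
#include <stdint.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical stand-in for pg_wchar: holds one Unicode code point. */
typedef uint32_t sketch_wchar;

/*
 * Sketch of an "is alphabetic" test for the regex code.
 * db_is_utf8 is a hypothetical flag meaning the database encoding is UTF8.
 */
static int
sketch_isalpha(sketch_wchar c, int db_is_utf8)
{
    if (db_is_utf8)
    {
        /*
         * Assume wchar_t uses the same representation as the code point.
         * Where wchar_t is only 16 bits, that only holds up to U+FFFF.
         */
        if (sizeof(wchar_t) >= 4 || c <= 0xFFFF)
            return iswalpha((wint_t) c) != 0;
        return 0;               /* give up on code points above U+FFFF */
    }

    /* Assumed fallback: trust <ctype.h> only for single-byte values. */
    return c <= UCHAR_MAX && isalpha((unsigned char) c) != 0;
}
```

What iswalpha() reports for non-ASCII code points depends on the process's
LC_CTYPE setting, so the results are only as good as the platform's locale
data.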

			regards, tom lane

Attachment: utf8-regex-1.patch
Description: text/x-patch (6.6 KB)
