Re: regexp character class locale awareness patch

From: Manuel Sugawara <masm(at)fciencias(dot)unam(dot)mx>
To: Peter Eisentraut <peter_e(at)gmx(dot)net>
Cc: Bruce Momjian <pgman(at)candle(dot)pha(dot)pa(dot)us>, Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: regexp character class locale awareness patch
Date: 2002-04-16 03:32:30
Message-ID: m38z7otj1t.fsf@dep4.fciencias.unam.mx
Lists: pgsql-hackers

According to POSIX (see regex(7)), the standard character classes are:

alnum digit punct
alpha graph space
blank lower upper
cntrl print xdigit

Many of those classes differ between locales, yet currently they all
behave as if the locale were C. Many of the tests have multibyte
issues; however, with the patch postgres will work for single-byte
encodings, which is better than nothing. If someone (Tatsuo?) gives
me some advice, I will work on the multibyte version.
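
To illustrate what "locale aware" means for single-byte encodings, here is a
minimal sketch, not the code from the patch. It maps each POSIX class name
to the corresponding <ctype.h> predicate, whose answers depend on the
current LC_CTYPE setting. The names cclass_table and in_cclass are
hypothetical and chosen only for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical lookup table: POSIX class name -> <ctype.h> predicate. */
typedef int (*cclass_fn) (int);

static const struct
{
    const char *name;
    cclass_fn   test;
} cclass_table[] = {
    {"alnum", isalnum}, {"digit", isdigit}, {"punct", ispunct},
    {"alpha", isalpha}, {"graph", isgraph}, {"space", isspace},
    {"blank", isblank}, {"lower", islower}, {"upper", isupper},
    {"cntrl", iscntrl}, {"print", isprint}, {"xdigit", isxdigit}
};

/*
 * Returns 1 if the single byte c belongs to the named class under the
 * current LC_CTYPE locale, 0 if it does not, and -1 if the class name
 * is unknown.  Taking an unsigned char keeps the value passed to the
 * <ctype.h> functions in their defined range.
 */
int
in_cclass(const char *name, unsigned char c)
{
    size_t      i;

    assert(name != NULL);
    for (i = 0; i < sizeof(cclass_table) / sizeof(cclass_table[0]); i++)
    {
        if (strcmp(cclass_table[i].name, name) == 0)
            return cclass_table[i].test(c) != 0;
    }
    return -1;
}
```

A program starts in the C locale, so without further setup only ASCII
letters count as alpha. After a call such as setlocale(LC_CTYPE, ""), the
same lookup may also accept bytes above 127 (accented letters in a Latin-1
locale, for instance), depending on which locales the system provides.
This approach inspects one byte at a time, which is why it does not cover
multibyte encodings.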

Peter Eisentraut <peter_e(at)gmx(dot)net> writes:
>
> Basically, you manually preprocess the patch to include the
> USE_LOCALE branch and remove the not USE_LOCALE branch.

Yeah, that should work. You may also remove include/regex/cclass.h
since it will not be used any more.

> However, if the no-locale branches have significant performance
> benefits then it might be worth pondering setting up some
> optimizations.

This is not the case.

Regards,
Manuel.