From: | Hannu Krosing <hannu(at)tm(dot)ee> |
---|---|
To: | Tatsuo Ishii <t-ishii(at)sra(dot)co(dot)jp> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Status report: regex replacement |
Date: | 2003-02-06 16:00:59 |
Message-ID: | 1044547258.22076.2.camel@huli |
Lists: | pgsql-hackers |
On Thu, 2003-02-06 at 13:25, Tatsuo Ishii wrote:
> > I have just committed the latest version of Henry Spencer's regex
> > package (lifted from Tcl 8.4.1) into CVS HEAD. This code is natively
> > able to handle wide characters efficiently, and so it avoids the
> > multibyte performance problems recently exhibited by Wade Klaver.
> > I have not done extensive performance testing, but the new code seems
> > at least as fast as the old, and much faster in some cases.
>
> I have tested the new regex with src/test/mb and it all passed. So the
> new code looks safe at least for EUC_CN, EUC_JP, EUC_KR, EUC_TW,
> MULE_INTERNAL, UNICODE, though the test does not include all possible
> regex patterns.
Perhaps we should call the encoding UTF8 (which it really is) rather
than UNICODE. Unicode is a character set with half a dozen official
encodings, so calling one of them "UNICODE" does not make things very
clear.
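
To illustrate the distinction, here is a minimal sketch in Python (chosen
only for illustration; it has nothing to do with the backend code). It
encodes one Unicode code point with several standard encodings and shows
that each yields a different byte sequence. The variable names are
made up for this example.

```python
# Illustrative sketch: one Unicode code point, several encodings.
ch = "\u00e4"  # U+00E4, LATIN SMALL LETTER A WITH DIAERESIS

# Codec names as spelled by Python's standard codecs module.
encodings = ["utf-8", "utf-16-le", "utf-16-be", "utf-32-le", "utf-32-be"]

# Map each encoding name to the bytes it produces for the same character.
encoded = {name: ch.encode(name) for name in encodings}

for name, raw in encoded.items():
    # Print the encoding name and its bytes as space-separated hex.
    print(f"{name:10} {raw.hex(' ')}")
```

The character is the same in every case; only the byte representation
differs, which is why naming a single encoding "UNICODE" is ambiguous.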
--
Hannu Krosing <hannu(at)tm(dot)ee>