Re: patch adding new regexp functions

From: Jeremy Drake <pgsql(at)jdrake(dot)com>
To: Neil Conway <neilc(at)samurai(dot)com>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>, David Fetter <david(at)fetter(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: patch adding new regexp functions
Date: 2007-02-10 09:26:00
Message-ID: Pine.BSO.4.64.0702100111320.28908@resin.csoft.net
Lists: pgsql-hackers pgsql-patches

On Sat, 10 Feb 2007, Jeremy Drake wrote:

> On Sat, 10 Feb 2007, Neil Conway wrote:
>
> > * I'm not clear about the control flow in regexp_matches() and
> > regexp_split(). Presumably it's not possible for the call_cntr to
> > actually exceed max_calls, so the error message in these cases should be
> > elog(ERROR), not ereport (the former is for "shouldn't happen" bug
> > scenarios, the latter is for user-facing errors). Can you describe the
> > logic here (e.g. via comments) a bit more?
>
> I added some comments, and changed to using elog instead of ereport.

I fixed a couple more things in this patch. I changed the max_calls limit
to the real limit, rather than the arbitrarily high limit that was
previously set (three times the length of the string in bytes). Also, I
changed the checks for offset to compare against wide_len rather than
orig_len: in multibyte character sets, orig_len is the length of the
string in bytes, in whatever encoding it is in, while wide_len is the
length in characters, which is what everything else in these functions
deals with.
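
To illustrate the byte-length versus character-length distinction, here is
a minimal standalone sketch. It is not code from the patch: it assumes
UTF-8 input, and the names utf8_char_count and char_offset_in_range are
made up for the example. It only shows why an offset counted in characters
has to be checked against the character count rather than the byte count.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Count the characters in a NUL-terminated string, assuming valid UTF-8.
 * Continuation bytes have the bit pattern 10xxxxxx, so every byte that
 * does not match that pattern starts a new character.
 */
size_t utf8_char_count(const char *s)
{
    size_t n = 0;

    assert(s != NULL);
    for (; *s != '\0'; s++)
        if (((unsigned char) *s & 0xC0) != 0x80)
            n++;
    return n;
}

/*
 * Hypothetical bounds check: an offset measured in characters is in range
 * only if it does not exceed the character count. Comparing it against
 * strlen() (the byte length) would accept offsets past the end whenever
 * the string contains multibyte characters.
 */
int char_offset_in_range(const char *s, size_t char_offset)
{
    return char_offset <= utf8_char_count(s);
}
```

For the string "naïve" encoded in UTF-8, strlen() reports 6 bytes while the
character count is 5, so a character offset of 6 would pass a byte-length
check but is actually out of range.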

The calls to text_substr also have me somewhat concerned now. I think
performance starts to look like O(n^2) in multibyte character sets, since
each call has to find its character offset from the start of the string.
But I think doing anything about it would require this code to know more
about the internals of text than it has any right to. I guess I'll settle
for correctness now, and if performance turns out to be a problem it can
be addressed then. I would hate to make this code even more ugly due to
premature optimization...
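
As a rough sketch of where the quadratic behavior comes from (again
standalone and assuming UTF-8; this is not text_substr or anything from
the patch, and all names below are hypothetical): locating character
number idx in a variable-width encoding means walking from the start of
the string, so doing that once per match costs O(n) each time. A cursor
that remembers its last position makes a left-to-right sequence of lookups
linear overall, at the price of knowing how the encoding is laid out.

```c
#include <assert.h>
#include <stddef.h>

/* Byte length of the UTF-8 character whose lead byte is c (valid input assumed). */
static size_t utf8_char_len(unsigned char c)
{
    if (c < 0x80)
        return 1;
    if ((c & 0xE0) == 0xC0)
        return 2;
    if ((c & 0xF0) == 0xE0)
        return 3;
    return 4;
}

/*
 * Naive lookup: byte offset of character number idx, scanning from the
 * start of the string on every call. Each call is O(idx), so calling it
 * once per match over a long string adds up to O(n^2).
 */
size_t byte_offset_from_start(const char *s, size_t idx)
{
    size_t off = 0;

    assert(s != NULL);
    while (idx > 0 && s[off] != '\0')
    {
        off += utf8_char_len((unsigned char) s[off]);
        idx--;
    }
    return off;
}

/* Cursor that remembers how far a previous lookup already walked. */
typedef struct
{
    const char *s;
    size_t      char_pos;   /* characters consumed so far */
    size_t      byte_pos;   /* bytes consumed so far */
} Utf8Cursor;

/*
 * Incremental lookup: advance the cursor to character number idx, which
 * must not be behind the current position. A whole left-to-right pass
 * touches each byte once, so the total work is O(n).
 */
size_t cursor_advance_to(Utf8Cursor *cur, size_t idx)
{
    assert(cur != NULL && idx >= cur->char_pos);
    while (cur->char_pos < idx && cur->s[cur->byte_pos] != '\0')
    {
        cur->byte_pos += utf8_char_len((unsigned char) cur->s[cur->byte_pos]);
        cur->char_pos++;
    }
    return cur->byte_pos;
}
```

Both functions return the same offsets; the cursor version just avoids
rescanning the prefix, which is exactly the kind of encoding knowledge the
regexp code would otherwise not need.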

--
When does summertime come to Minnesota, you ask?
Well, last year, I think it was a Tuesday.

Attachment Content-Type Size
regexp-split-matches-documented_new-6.patch text/plain 47.7 KB
