Re: [HACKERS] writing new regexp functions

From: Jeremy Drake <pgsql(at)jdrake(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL Patches <pgsql-patches(at)postgresql(dot)org>
Subject: Re: [HACKERS] writing new regexp functions
Date: 2007-02-02 03:29:35
Lists: pgsql-hackers, pgsql-patches

On Thu, 1 Feb 2007, Jeremy Drake wrote:

> On Thu, 1 Feb 2007, Tom Lane wrote:
> > Jeremy Drake <pgsql(at)jdrake(dot)com> writes:
> > > Is there some specific reason that these functions are static,
> >
> > Yeah: not cluttering the global namespace.
> > Is there a reason for not putting your new code itself into regexp.c?
> Not really, I just figured it would be cleaner/easier to write it as an
> extension.  I also figure that it is unlikely that every regexp function
> that anyone could possibly want will be implemented in core in that one
> file.

> Anyway, the particular thing I was writing was a function like
> substring(str FROM pattern) which instead of returning just the first
> match group, would return an array of text containing all of the match
> groups.  I exported the functions in my sandbox, and wrote a module with a
> function that does this.

I have attached the patch I have put together, which does the following:
* Expose the previously static RE_* functions from regexp.c which wrap
  the code in src/backend/regex with postgres-style errors, string
  conversion, and caching of patterns.

* Expose the regex_flavor GUC variable, which is needed to know how to
  interpret patterns when compiling them.

* Add a couple more RE_* functions in regexp.c to provide access
  to different levels of the process, which were necessary to avoid
  duplicating effort elsewhere.

* Update replace_text_regexp in varlena.c to use newly exposed functions
  from regexp.c instead of duplicating error handling code from there.
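
To illustrate what "caching of patterns" means for callers of the exposed
functions, here is a rough standalone sketch.  It is not the code in
regexp.c: it uses POSIX <regex.h> in place of the backend regex library, and
cached_compile, cached_re, CACHE_SIZE and compile_count are hypothetical
names made up for this example.  The idea shown is a small most-recently-used
list of compiled patterns, so that a repeated pattern is not recompiled.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SIZE 4            /* hypothetical limit, chosen for the example */

typedef struct
{
    char    *pattern;           /* pattern source text (cache key) */
    int      cflags;            /* compile flags (also part of the key) */
    regex_t  re;                /* compiled form */
} cached_re;

static cached_re *cache[CACHE_SIZE];    /* most recently used entry first */
static int cache_used = 0;
static int compile_count = 0;           /* number of successful regcomp calls */

/*
 * Return a compiled regex for (pattern, cflags), compiling only on a cache
 * miss.  Returns NULL on compile error or out of memory.  The pointer stays
 * valid until the entry is evicted by later calls.
 */
regex_t *
cached_compile(const char *pattern, int cflags)
{
    cached_re *e;
    int        i;

    for (i = 0; i < cache_used; i++)
    {
        e = cache[i];
        if (e->cflags == cflags && strcmp(e->pattern, pattern) == 0)
        {
            /* Hit: shift earlier entries down and move this one to front. */
            memmove(&cache[1], &cache[0], i * sizeof(cache[0]));
            cache[0] = e;
            return &e->re;
        }
    }

    /* Miss: build a new entry. */
    e = malloc(sizeof(*e));
    if (e == NULL)
        return NULL;
    e->pattern = malloc(strlen(pattern) + 1);
    if (e->pattern == NULL)
    {
        free(e);
        return NULL;
    }
    strcpy(e->pattern, pattern);
    e->cflags = cflags;
    if (regcomp(&e->re, pattern, cflags) != 0)
    {
        free(e->pattern);
        free(e);
        return NULL;
    }
    compile_count++;

    /* Cache full: drop the least recently used entry (the last one). */
    if (cache_used == CACHE_SIZE)
    {
        cached_re *old = cache[CACHE_SIZE - 1];

        regfree(&old->re);
        free(old->pattern);
        free(old);
        cache_used--;
    }

    memmove(&cache[1], &cache[0], cache_used * sizeof(cache[0]));
    cache[0] = e;
    cache_used++;
    return &e->re;
}
```

Calling cached_compile("a+b", REG_EXTENDED) twice in a row returns the same
pointer and compiles only once; the entries are heap-allocated so that
reordering the list never moves a regex_t in memory.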

Also attached is the function I wrote to retrieve all of the capture
groups in a pattern match in a text[].  I also intend to put together a
function analogous to split_part which will take a string and a pattern to
split on, and return setof text.
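
For readers without the attachment handy, the capture-group idea can be
sketched outside the backend roughly as follows.  This is only an
illustration under stated assumptions, not the attached regexp_ext.c: it uses
POSIX <regex.h> rather than the RE_* functions, returns a plain C array
instead of a text[], and match_groups is a hypothetical name.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: match str against pattern (POSIX extended syntax)
 * and return a malloc'd array with one malloc'd string per capture group
 * of the first match.  A group that did not participate in the match (or
 * whose copy could not be allocated) is left as NULL.  *ngroups receives
 * the number of groups.  Returns NULL if the pattern does not compile,
 * does not match, or memory runs out.
 */
char **
match_groups(const char *pattern, const char *str, size_t *ngroups)
{
    regex_t     re;
    regmatch_t *pm;
    char      **result = NULL;
    size_t      n;
    size_t      i;

    *ngroups = 0;
    if (regcomp(&re, pattern, REG_EXTENDED) != 0)
        return NULL;

    n = re.re_nsub;             /* number of parenthesized groups */
    pm = malloc((n + 1) * sizeof(regmatch_t));  /* slot 0 is the whole match */

    if (pm != NULL && regexec(&re, str, n + 1, pm, 0) == 0)
    {
        result = calloc(n > 0 ? n : 1, sizeof(char *));
        for (i = 0; result != NULL && i < n; i++)
        {
            const regmatch_t *m = &pm[i + 1];
            size_t            len;

            if (m->rm_so < 0)
                continue;       /* group not used in this match */
            len = (size_t) (m->rm_eo - m->rm_so);
            result[i] = malloc(len + 1);
            if (result[i] != NULL)
            {
                memcpy(result[i], str + m->rm_so, len);
                result[i][len] = '\0';
            }
        }
        if (result != NULL)
            *ngroups = n;
    }

    free(pm);
    regfree(&re);
    return result;
}
```

With the pattern "([a-z]+)=([0-9]+)" and the string "key=42", this yields
two groups, "key" and "42".  The caller is responsible for freeing each
element and then the array.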

Let me know whether I should assume the attached patch and write the
functions for contrib or pgfoundry, put the functions in regexp.c and try
to get them into core, or both.  (Not having to restart the postmaster
every time I recompiled made working on the function a lot easier; it
would be nice to be able to write extensions like this in the future.)

To err is human, to forgive, beyond the scope of the Operating System.

Attachment: regexp_ext.c
Description: text/plain (1.8 KB)
Attachment: regexp-export.patch
Description: text/plain (9.3 KB)
