Re: writing new regexp functions

From: David Fetter <david(at)fetter(dot)org>
To: Jeremy Drake <pgsql(at)jdrake(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>,PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: writing new regexp functions
Date: 2007-02-02 06:55:18
Message-ID: 20070202065518.GF3882@fetter.org
Lists: pgsql-hackers, pgsql-patches

On Thu, Feb 01, 2007 at 10:16:54PM -0800, Jeremy Drake wrote:
> On Thu, 1 Feb 2007, David Fetter wrote:
> 
> > On Thu, Feb 01, 2007 at 05:11:30PM -0800, Jeremy Drake wrote:
> > > Anyway, the particular thing I was writing was a function like
> > > substring(str FROM pattern) which instead of returning just the
> > > first match group, would return an array of text containing all
> > > of the match groups.
> 
> If you are subscribed to -patches, I sent my code to date there
> earlier this evening.  I also said that I wanted to make a function
> that split on a pattern (like perl split) and returned setof text.
> 
> > That'd be great!  People who use dynamic languages like Perl would
> > feel much more at home having access to all the matches.  While
> > you're at it, could you make pre-match and post-match
> > (optionally--I know it's expensive) available?
> 
> I could, but I'm not sure how someone would go about accessing such
> a thing.  What I just wrote would be most like this perl: @foo =
> ($str=~/pattern/);

> Where would pre and post match fit into this?  Are you talking about a
> different function?

Yes, although it might have the same name, as in regex_match(pattern
TEXT, string TEXT, return_pre_and_post BOOL).

> Or sticking prematch at the beginning of the array and postmatch at
> the end?  I could also put the whole match somewhere, but I did
> not in this version.

The data structure could be something like

CREATE TYPE matches AS (
    prematch  TEXT,
    match     TEXT[],
    postmatch TEXT
);
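
To make the proposed shape concrete, here is a minimal sketch in Python rather than in backend C or SQL. It assumes the three-argument signature suggested above; the names `Matches` and `regex_match` are only illustrative, and Python's `re` module stands in for the PostgreSQL regex engine, whose syntax and semantics differ in places.

```python
import re
from typing import List, NamedTuple, Optional


class Matches(NamedTuple):
    # Mirrors the proposed composite type: prematch, match[], postmatch.
    prematch: Optional[str]
    match: List[Optional[str]]
    postmatch: Optional[str]


def regex_match(pattern: str, string: str,
                return_pre_and_post: bool = False) -> Optional[Matches]:
    """Hypothetical analogue of regex_match(pattern, string, return_pre_and_post)."""
    m = re.search(pattern, string)
    if m is None:
        return None
    # One entry per capture group; a group that did not participate is None.
    groups = list(m.groups())
    if not return_pre_and_post:
        # Skip the extra slicing unless the caller asks for it.
        return Matches(None, groups, None)
    return Matches(string[:m.start()], groups, string[m.end():])


result = regex_match(r"(\d+)-(\d+)", "tel: 555-1234 ext", True)
print(result)
```

Keeping the capture groups in their own field means the pre- and post-match text never shifts the group numbering, and callers who do not ask for them pay nothing extra.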

> The code I wrote returns a text[] which is one-dimensional, has a lower
> bound of 1 (as most postgres arrays do), where if there are n capture
> groups, ra[1] has the first capture group and ra[n] has the last one.
> Since postgres has an option to make different lower bounds, I suppose I
> could have an option to put the prematch in [-1], the entire match in [0],
> and the postmatch in [n+1].  This seems to be odd to me though.

Odd == bad.  I think the pre- and post-matches should be different in
essence, not just in index :)
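
For contrast, the index-based layout from the quoted paragraph can be sketched the same way. This is an illustration only: a Python dict keyed by integer stands in for a text[] with a lower bound of -1, and the function name `match_with_bounds` is hypothetical.

```python
import re
from typing import Dict, Optional


def match_with_bounds(pattern: str, string: str) -> Optional[Dict[int, Optional[str]]]:
    """Lay out prematch at -1, the whole match at 0, groups at 1..n, postmatch at n+1."""
    m = re.search(pattern, string)
    if m is None:
        return None
    n = m.re.groups  # number of capture groups in the pattern
    layout: Dict[int, Optional[str]] = {
        -1: string[:m.start()],
        0: m.group(0),
    }
    for i in range(1, n + 1):
        layout[i] = m.group(i)
    # The postmatch slot moves with the number of capture groups.
    layout[n + 1] = string[m.end():]
    return layout


ra = match_with_bounds(r"(\d+)-(\d+)", "tel: 555-1234 ext")
print(ra)
```

In this layout the position of the postmatch depends on how many capture groups the pattern has, so a caller has to know n before it can tell a capture group from the trailing text.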

> I guess I'm saying, I agree that the entire match, prematch, and postmatch
> would be helpful, but how would you propose to present these to the user?

See above :)

Cheers,
D
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
phone: +1 415 235 3778        AIM: dfetter666
                              Skype: davidfetter

Remember to vote!
