Re: regexp_matches() quantified-capturing-parentheses oddity

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Julian Mehnle <julian(at)mehnle(dot)net>
Cc: pgsql-general(at)postgresql(dot)org
Subject: Re: regexp_matches() quantified-capturing-parentheses oddity
Date: 2009-12-08 16:49:34
Message-ID: 13289.1260290974@sss.pgh.pa.us
Lists: pgsql-general

Julian Mehnle <julian(at)mehnle(dot)net> writes:
> So far, so good. However, can someone please explain the following to me?
> wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+)+', 'g');
> wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+){1,2}', 'g');
> wisu-dev=# SELECT regexp_matches('quux@foo@bar.zip', '([@.]|[^@.]+){1,3}', 'g');

These might be a bug, but it doesn't seem to me that the behavior would
be terribly well defined in any case. The function should be pulling the
match to the parenthesized subexpression, but here that subexpression
has got multiple matches. Which one would you expect to get?

Instead of (foo)+ I'd try
	((?:foo)+)	if you want all the matches
	(foo)(?:foo)*	if you want the first one
	(?:foo)*(foo)	if you want the last one
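
A rough sketch of those three rewrites, in case it helps to see them side
by side. Note the assumptions: it uses Python's re module rather than
PostgreSQL's regex engine, so it only illustrates the idea; the token
pattern [a-c][0-9] and the input 'a1b2c3' are hypothetical stand-ins for
"foo"; and the repeated group is written as non-capturing, (?:...), so
that each pattern has exactly one capture.

```python
import re

# Hypothetical stand-in for "foo": one letter a-c followed by one digit.
TOKEN = r"[a-c][0-9]"
TEXT = "a1b2c3"


def first_group(pattern, text=TEXT):
    """Return capture group 1 of the first match, or None if nothing matches."""
    m = re.search(pattern, text)
    return m.group(1) if m else None


# All repetitions: capture the whole quantified (non-capturing) group.
all_matches = first_group(rf"((?:{TOKEN})+)")

# First repetition: capture once, then consume the rest without capturing.
first_match = first_group(rf"({TOKEN})(?:{TOKEN})*")

# Last repetition: consume the leading repetitions, capture the final one.
last_match = first_group(rf"(?:{TOKEN})*({TOKEN})")

print(all_matches, first_match, last_match)
```

The stand-in token has a fixed length on purpose. With a variable-length
alternative such as [^@.]+, the "last one" form can split a token
differently from what you'd expect, and engines may disagree on how.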

regards, tom lane
