BUG #16133: Regexp quantifier issues

From: PG Bug reporting form <noreply(at)postgresql(dot)org>
To: pgsql-bugs(at)lists(dot)postgresql(dot)org
Cc: andrew(at)tao11(dot)riddles(dot)org(dot)uk
Subject: BUG #16133: Regexp quantifier issues
Date: 2019-11-22 20:08:55
Message-ID: 16133-a8934caee4e53035@postgresql.org
Lists: pgsql-bugs

The following bug has been logged on the website:

Bug reference: 16133
Logged by: Andrew Gierth
Email address: andrew(at)tao11(dot)riddles(dot)org(dot)uk
PostgreSQL version: 12.1
Operating system: any
Description:

(This started out as an IRC discussion on #tcl that spilled over to
#postgresql.)

SELECT regexp_match('aaa', '(a*)*');
 regexp_match
--------------
 {aaa}
(1 row)

SELECT regexp_match('aaa', '(a*)+');
 regexp_match
--------------
 {""}
(1 row)

What seems to be happening here is that in the + case, the engine is doing
one more match, matching (a*) against an empty string at the end of the
input, unlike the * case where the last match of (a*) is against the whole
string. This seems to violate the rules for determining where subexpression
captures line up. (And certainly there is no justification for the + vs. *
quantifier to make any difference here.)
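
To make the comparison easier to repeat, the two patterns can be run side by side against the same input. This is only a sketch: the derived table `t`, its column `pattern`, and the output alias `captures` are names chosen here for illustration, and the result may differ on versions other than the 12.1 reported above.

```sql
-- Evaluate both quantified patterns from the report against 'aaa'
-- and return the captured subexpression for each, one row per pattern.
-- regexp_match() requires PostgreSQL 10 or later.
SELECT pattern,
       regexp_match('aaa', pattern) AS captures
FROM (VALUES ('(a*)*'), ('(a*)+')) AS t(pattern);
```

If the two quantifiers were treated consistently, both rows would be expected to show the same capture.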

There are a large number of similar cases, but this seems to be the common
factor to all of them so far.
