From: | code(at)phaedrusdeinus(dot)org |
---|---|
To: | pgsql-bugs(at)postgresql(dot)org |
Subject: | BUG #6381: Incorrect greediness behavior in certain regular expressions |
Date: | 2012-01-06 00:32:17 |
Message-ID: | E1RixjF-0006gP-Dt@wrigleys.postgresql.org |
Lists: | pgsql-bugs |
The following bug has been logged on the website:
Bug reference: 6381
Logged by: john melesky
Email address: code(at)phaedrusdeinus(dot)org
PostgreSQL version: 9.1.1
Operating system: x86_64-pc-linux-gnu
Description:
This simple regexp returns correctly (that is, (.*?) matches
'blahblah.com'):
=# select regexp_matches('http://blahblah.com/asdf',
'http://(.*?)(/|%2f|$)');
regexp_matches
------------------
{blahblah.com,/}
This more complete version of the pattern matches greedily, which is incorrect:
=# select regexp_matches('http://blahblah.com/asdf',
'http(s?)(:|%3a)(//|%2f%2f)(.*?)(/|%2f|$)');
regexp_matches
--------------------------------
{"",:,//,blahblah.com/asdf,""}
(That is, (.*?) matches 'blahblah.com/asdf')
The problem appears to be the inclusion of '$' in the final paren group. So,
this works:
=# select regexp_matches('http://blahblah.com/asdf',
'http(s?)(:|%3a)(//|%2f%2f)(.*?)(/|%2f)');
regexp_matches
--------------------------
{"",:,//,blahblah.com,/}
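For comparison, the result expected above can be reproduced with a backtracking regex engine. The sketch below uses Python's `re` module, which is not PostgreSQL's regex engine and may follow different matching rules; it only illustrates the Perl-style non-greedy behavior this report assumes. The names `URL`, `PATTERN`, and `host_groups` are hypothetical and not part of the original report.

```python
import re

# Same input string and pattern as the second query in the report.
URL = "http://blahblah.com/asdf"
PATTERN = r"http(s?)(:|%3a)(//|%2f%2f)(.*?)(/|%2f|$)"


def host_groups(url=URL):
    """Return the captured groups for url, or None if the pattern does not match."""
    m = re.match(PATTERN, url)
    return m.groups() if m else None


# In Python's engine the lazy (.*?) stops at the first '/', even though
# the final group also offers '$' as an alternative.
print(host_groups())  # ('', ':', '//', 'blahblah.com', '/')
```

When the input has no trailing path, the `$` alternative is what ends the match, so the last group comes back empty while the host group is unchanged.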