Re: Pathological regexp match

From:	Alvaro Herrera <alvherre(at)commandprompt(dot)com>
To:	Michael Glaesemann <michael(dot)glaesemann(at)myyearbook(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: Pathological regexp match
Date:	2010-01-29 03:58:46
Message-ID:	20100129035846.GE1793@alvh.no-ip.org
Lists:	pgsql-hackers

Michael Glaesemann wrote:
>
> On Jan 28, 2010, at 21:59 , Alvaro Herrera wrote:
>
> >Hi Michael,
> >
> >Michael Glaesemann wrote:
> >>We came across a regexp that takes very much longer than expected.
> >>
> >>PostgreSQL 8.4.1 on x86_64-unknown-linux-gnu, compiled by GCC
> >>gcc (GCC) 4.1.2 20080704 (Red Hat 4.1.2-44), 64-bit
> >>
> >>SELECT 'ooo...' ~ $r$Z(Q)[^Q]*A.*?(\1)$r$; -- omitted for email
> >>brevity
> >
> >The ? after .* is pointless.
>
> Interesting. I would expect that *? would be the non-greedy version
> of *, meaning match up to the first \1 (in this case the first Q
> following A), rather than as much as possible.

Huh, you are right, *? is the non-greedy version. I keep forgetting
those. Note that they only work if you have regex_flavor set to
advanced, though (which is the default).

> However, as you point out, Postgres doesn't appear to take this into
> account:
>
> postgres=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)[^Q]*A.*(\2))$r$, $s$X$s$);
> regexp_replace
> ----------------
> oooXooo
> (1 row)
>
> postgres=# select regexp_replace('oooZQoooAoooQooQooQooo', $r$(Z(Q)[^Q]*A.*?(\2))$r$, $s$X$s$);
> regexp_replace
> ----------------
> oooXooo
> (1 row)

Hmm, that's strange ...
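
For comparison, here is a minimal sketch of what per-quantifier non-greedy
matching would give on the same input, which is the result Michael expected.
It uses Python's re module, a different engine from PostgreSQL's, so it only
illustrates the expectation and says nothing about how PostgreSQL evaluates
the pattern. The variable names are made up for the illustration.

```python
import re

# Same subject string as in the psql session above.
subject = 'oooZQoooAoooQooQooQooo'

# Greedy .* : runs to the last Q that can still satisfy the backreference \2.
greedy_result = re.sub(r'(Z(Q)[^Q]*A.*(\2))', 'X', subject)

# Non-greedy .*? : stops at the first Q following the A.
lazy_result = re.sub(r'(Z(Q)[^Q]*A.*?(\2))', 'X', subject)

print(greedy_result)  # oooXooo
print(lazy_result)    # oooXooQooQooo
```

In this engine the two patterns give different results, whereas both
PostgreSQL queries above return oooXooo. One possible explanation, going by
the matching rules in the PostgreSQL documentation, is that a branch takes
its greediness from its first quantified atom, which here is the greedy
[^Q]*.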

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

In response to

Re: Pathological regexp match at 2010-01-29 03:37:39 from Michael Glaesemann
