Re: Another regexp performance improvement: skip useless paren-captures

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Joel Jacobson <joel(at)compiler(dot)org>
Subject: Re: Another regexp performance improvement: skip useless paren-captures
Date: 2021-08-10 02:20:33
Message-ID: 3756810.1628562033@sss.pgh.pa.us
Lists: pgsql-hackers

Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> writes:
> I ran a lot of tests with the patch, and the asserts have all cleared up, but I don't know how to think about the user-facing differences. If we had a good reason for raising an error for these back-references, maybe that'd be fine, but it seems to just be an implementation detail.

I thought about this some more, and I'm coming around to the idea that
throwing an error is the wrong thing. As a contrary example, consider

(.)|(\1\1)

We don't throw an error for this, and neither does Perl, even though
the capturing parens can never be defined in the branch where the
backrefs are. So it seems hard to argue that this is okay but the
other thing isn't. Another interesting example is

(.){0}(\1){0}

I think that the correct interpretation is that this is a valid
regexp matching an empty string (i.e., zero repetitions of each
part), even though neither capture group will be defined.
That's different from

(.){0}(\1)

which can never match.
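
As a rough illustration of the three cases above, here is a minimal sketch using Python's re module as a stand-in engine. This is an assumption made for illustration only: it is not PostgreSQL's regex code, and it says nothing about what the attached patch does. The helper name probe is hypothetical. Python happens to treat a backref to an undefined group as a failed match rather than a compile error, which lines up with the interpretation argued for here.

```python
import re

def probe(pattern, text):
    """Return the tuple of captured groups if pattern matches all of text, else None.

    Hypothetical helper for illustration; uses Python's re, not PostgreSQL's engine.
    """
    m = re.fullmatch(pattern, text)
    return m.groups() if m else None

# Backrefs sit in a branch where group 1 can never be defined.
# The pattern still compiles; only the first branch can ever match.
print(probe(r"(.)|(\1\1)", "a"))
print(probe(r"(.)|(\1\1)", ""))

# Zero repetitions of each part: matches the empty string,
# and neither capture group ends up defined.
print(probe(r"(.){0}(\1){0}", ""))

# The backref must be satisfied once, but group 1 is never defined,
# so no input can match.
print(probe(r"(.){0}(\1)", ""))
print(probe(r"(.){0}(\1)", "aa"))
```

Other engines may disagree on these cases, so the output is only evidence about Python's behavior, not about what Postgres or Perl do.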

So I took another look at the code, and it doesn't seem that hard
to make it act this way. The attached passes the regression tests,
but I've not beaten on it with random strings.

regards, tom lane

Attachment: alternate-fix-zero-quantified-nested-parens.patch (text/x-diff, 2.8 KB)
