Re: Another regexp performance improvement: skip useless paren-captures

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Joel Jacobson <joel(at)compiler(dot)org>
Subject: Re: Another regexp performance improvement: skip useless paren-captures
Date: 2021-08-05 14:36:21
Message-ID: 2390507.1628174181@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> I'm a bit worried about how you'll keep track of back-ref numbering
> since back-refs only count capturing groups, and you're silently turning
> a capturing group into a non-capturing group.

They're already numbered at this point, and we aren't changing the numbers
of the capturing groups that remain live. There will be unused entries in
the regmatch_t array at runtime (corresponding to the zapped groups), but
that doesn't cost anything worth mentioning.
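
To illustrate the point about numbering, here is a toy sketch in Python. It is not the regex library's code. The function name plan_captures and the rule "a group stays live only if some back-reference uses it" are assumptions made for illustration, and the scanner ignores bracket expressions and multi-digit back-references. It numbers every capturing group first and only afterwards decides which ones are live, so the surviving groups keep their original numbers.

```python
def plan_captures(pattern: str) -> dict[int, bool]:
    """Map each capturing group's number to True (live) or False (zapped).

    Toy model: numbers are assigned in order of the opening parenthesis,
    before any group is dropped, so dropping a group never renumbers others.
    """
    count = 0            # capturing groups seen so far
    referenced = set()   # group numbers used by a back-reference
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "\\" and i + 1 < len(pattern):
            nxt = pattern[i + 1]
            if nxt in "123456789":
                referenced.add(int(nxt))   # \N refers to group N
            i += 2                         # skip the escaped character
            continue
        if ch == "(" and not pattern.startswith("(?", i):
            count += 1                     # plain "(" opens a capturing group
        i += 1
    return {n: n in referenced for n in range(1, count + 1)}


plan = plan_captures(r"(a+)(b+)c\2")
# Group 1 is zapped, group 2 is live and is still group 2.
# The match array keeps one slot per numbered group (plus slot 0 for the
# whole match); the slot for a zapped group is simply never filled in.
slots = [None] * (len(plan) + 1)
```

In this model the array for the pattern above has three slots, and slot 1 stays unused, which mirrors the unused regmatch_t entries described above.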

Now that you mention it, I am not sure whether there are any regression
test cases that specifically cover still being able to match \2 when
the first capture group went away. Probably should add more cases...

regards, tom lane
