Re: Another regexp performance improvement: skip useless paren-captures

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Joel Jacobson <joel(at)compiler(dot)org>
Subject: Re: Another regexp performance improvement: skip useless paren-captures
Date: 2021-08-08 20:25:00
Message-ID: 3224306.1628454300@sss.pgh.pa.us
Lists: pgsql-hackers

Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com> writes:
> Hmm. This changes the behavior when applied against master (c1132aae336c41cf9d316222e525d8d593c2b5d2):

> select regexp_split_to_array('uuuzkodphfbfbfb', '((.))(\1\2)', 'ntw');
> regexp_split_to_array
> -----------------------
> - {"",zkodphfbfbfb}
> + {uuuzkodphfbfbfb}
> (1 row)

Ugh. The regex engine is finding the match correctly, but it's failing to
tell the caller where it is :-(. I was a little too cute in optimizing
the regmatch_t result-vector copying in pg_regexec, and forgot to ensure
that the overall match position would be reported.
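
To illustrate the kind of invariant involved, here is a minimal sketch, not the
code in the attached patch. The type span_t and the function report_spans are
hypothetical stand-ins invented for this example; the real code works with
regmatch_t inside pg_regexec. The assumption is that the engine may track fewer
subexpressions than the caller asked for, and that slot 0 (the overall match)
must be copied out regardless:

```c
#include <stddef.h>

/* Hypothetical stand-in for a regmatch_t-style entry; -1 marks "unset". */
typedef struct
{
    long so;   /* start offset of the match */
    long eo;   /* end offset of the match */
} span_t;

/*
 * Copy match spans into the caller's vector.
 *
 * found[0] is the overall match; found[1..nfound-1] are whatever
 * subexpressions the engine bothered to track.  Even when nfound == 1
 * (all captures skipped), slot 0 must still reach the caller.
 */
void
report_spans(span_t *out, size_t nout, const span_t *found, size_t nfound)
{
    size_t i;

    for (i = 0; i < nout; i++)
    {
        if (i < nfound)
            out[i] = found[i];      /* includes the overall match at i == 0 */
        else
        {
            out[i].so = -1;         /* untracked capture: report as unset */
            out[i].eo = -1;
        }
    }
}
```

In this sketch, a caller that gets back an unset slot 0 would conclude there was
no match to split on, which is consistent with the unsplit result in the report
above.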

Thanks for the testing!

regards, tom lane

Attachment: optimize-useless-captures-3.patch (text/x-diff, 14.6 KB)
