Re: Regex back-reference semantics and performance

From: "Joel Jacobson" <joel(at)compiler(dot)org>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Regex back-reference semantics and performance
Date: 2021-03-01 11:13:56
Message-ID: d80e1689-a3e4-4c86-a3e8-d22a0ac516fe@www.fastmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Mar 1, 2021, at 01:53, Tom Lane wrote:
>0001-fix-backref-semantics.patch
>0002-backref-performance-hack.patch

I've successfully tested both patches.

On HEAD the trouble-query took forever, I cancelled it after 23 minutes.

HEAD (f5a5773a9dc4185414fe538525e20d8512c2ba35):
SELECT regexp_matches(subject, pattern, 'g') FROM trouble;
^CCancel request sent
ERROR: canceling statement due to user request
Time: 1387398.764 ms (23:07.399)

HEAD + 0001 + 0002:
SELECT regexp_matches(subject, pattern, 'g') FROM trouble;
Time: 24.943 ms
Time: 22.217 ms
Time: 20.250 ms

Very nice!

I also verified the patches gave the same result for the performance_test:

SELECT
is_match <> (subject ~ pattern) AS is_match_diff,
captured IS DISTINCT FROM regexp_match(subject, pattern, flags) AS captured_diff,
COUNT(*)
FROM performance_test
GROUP BY 1,2
ORDER BY 1,2
;

is_match_diff | captured_diff | count
---------------+---------------+---------
f | f | 3360068
(1 row)

No notable timing differences:

HEAD (f5a5773a9dc4185414fe538525e20d8512c2ba35)
Time: 97016.668 ms (01:37.017)
Time: 96945.567 ms (01:36.946)
Time: 95261.263 ms (01:35.261)

HEAD + 0001:
Time: 97165.302 ms (01:37.165)
Time: 96327.836 ms (01:36.328)
Time: 96295.643 ms (01:36.296)

HEAD + 0001 + 0002:
Time: 96447.527 ms (01:36.448)
Time: 94262.288 ms (01:34.262)
Time: 95331.483 ms (01:35.331)

/Joel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2021-03-01 11:54:49 Re: Fix DROP TABLESPACE on Windows with ProcSignalBarrier?
Previous Message Thomas Munro 2021-03-01 11:00:08 Re: PostmasterIsAlive() in recovery (non-USE_POST_MASTER_DEATH_SIGNAL builds)