Re: Another regexp performance improvement: skip useless paren-captures

From: Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Joel Jacobson <joel(at)compiler(dot)org>
Subject: Re: Another regexp performance improvement: skip useless paren-captures
Date: 2021-08-05 21:37:21
Message-ID: 2BFE7DFD-8537-4B4E-BC4C-2F265A836651@enterprisedb.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Aug 5, 2021, at 7:36 AM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
> Probably should add more cases...

The patch triggers an assertion that master does not:

+select 'azrlfkjbjgidgryryiglcabkgqluflu' !~ '(.(.)((.)))((?:(\3)))';
+server closed the connection unexpectedly
+ This probably means the server terminated abnormally
+ before or while processing the request.
+connection to server was lost

The relevant portion of the stack trace:

frame #3: 0x00000001043bcf6d postgres`ExceptionalCondition(conditionName=<unavailable>, errorType=<unavailable>, fileName=<unavailable>, lineNumber=<unavailable>) at assert.c:69:2 [opt]
frame #4: 0x000000010410168b postgres`cdissect(v=0x00007ffeebdd2ad8, t=0x00007f863cd055b0, begin=0x00007f863d821528, end=0x00007f863d82152c) at regexec.c:767:4 [opt]
frame #5: 0x000000010410129b postgres`cdissect [inlined] ccondissect(v=<unavailable>, t=<unavailable>, begin=0x00007f863d821524, end=<unavailable>) at regexec.c:835:10 [opt]
frame #6: 0x000000010410123d postgres`cdissect(v=0x00007ffeebdd2ad8, t=0x00007f863cd05430, begin=0x00007f863d821524, end=0x00007f863d82152c) at regexec.c:752 [opt]
frame #7: 0x000000010410129b postgres`cdissect [inlined] ccondissect(v=<unavailable>, t=<unavailable>, begin=0x00007f863d821520, end=<unavailable>) at regexec.c:835:10 [opt]
frame #8: 0x000000010410123d postgres`cdissect(v=0x00007ffeebdd2ad8, t=0x00007f863cd050f0, begin=0x00007f863d821520, end=0x00007f863d82152c) at regexec.c:752 [opt]
frame #9: 0x0000000104101282 postgres`cdissect [inlined] ccondissect(v=<unavailable>, t=<unavailable>, begin=0x00007f863d821520, end=<unavailable>) at regexec.c:832:9 [opt]
frame #10: 0x000000010410123d postgres`cdissect(v=0x00007ffeebdd2ad8, t=0x00007f863cd04ff0, begin=0x00007f863d821520, end=0x00007f863d821530) at regexec.c:752 [opt]
frame #11: 0x00000001040ff508 postgres`pg_regexec [inlined] cfindloop(v=<unavailable>, cnfa=<unavailable>, cm=<unavailable>, d=0x00007ffeebdd6d68, s=0x00007ffeebdd2b48, coldp=<unavailable>) at regexec.c:600:10 [opt]
frame #12: 0x00000001040ff36b postgres`pg_regexec [inlined] cfind(v=0x000000010459c5f8, cnfa=<unavailable>, cm=<unavailable>) at regexec.c:515 [opt]
frame #13: 0x00000001040ff315 postgres`pg_regexec(re=<unavailable>, string=<unavailable>, len=140732855577960, search_start=<unavailable>, details=<unavailable>, nmatch=0, pmatch=0x0000000000000000, flags=0) at regexec.c:293 [opt]
frame #14: 0x0000000104244d61 postgres`RE_wchar_execute(re=<unavailable>, data=<unavailable>, data_len=<unavailable>, start_search=<unavailable>, nmatch=<unavailable>, pmatch=<unavailable>) at regexp.c:274:19 [opt]
frame #15: 0x0000000104242c80 postgres`textregexne [inlined] RE_execute(dat=<unavailable>, dat_len=31, nmatch=0, pmatch=0x0000000000000000) at regexp.c:322:10 [opt]
frame #16: 0x0000000104242c50 postgres`textregexne [inlined] RE_compile_and_execute(text_re=<unavailable>, dat=<unavailable>, dat_len=31, cflags=19, collation=<unavailable>, nmatch=0, pmatch=<unavailable>) at regexp.c:357 [opt]


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2021-08-05 22:15:27 Re: Assert triggered during RE_compile_and_cache
Previous Message Mark Dilger 2021-08-05 20:49:00 Re: Assert triggered during RE_compile_and_cache