Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Noah Misch <noah(at)leadboat(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is RecoveryConflictInterrupt() entirely safe in a signal handler?
Date: 2023-01-04 03:46:05
Message-ID: CA+hUKGLNOFKfAG5ETuOEEzukgmOFGsiunfL2AfZmKhWx8EJE3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Jan 2, 2023 at 8:38 AM Thomas Munro <thomas(dot)munro(at)gmail(dot)com> wrote:
> On Sat, Dec 31, 2022 at 6:36 PM Noah Misch <noah(at)leadboat(dot)com> wrote:
> > On Sat, Dec 31, 2022 at 10:06:53AM +1300, Thomas Munro wrote:
> > > Idea #8 is a realisation that twisting oneself into a pretzel to avoid
> > > having to change the regexp code or its REG_CANCEL control flow may be
> > > a bit silly. If the only thing it really needs to do is free some
> > > memory, maybe the regexp module should provide a function that frees
> > > everything that is safe to call from our rcancelrequested callback, so
> > > we can do so before we longjmp back to Kansas. Then the REG_CANCEL
> > > code paths would be effectively unreachable in PostgreSQL. I don't
> > > know if it's better or worse than your idea #6, "use palloc instead,
> > > it already has garbage collection, duh", but it's a different take on
> > > the same realisation that this is just about free().
> >
> > PG_TRY() isn't free, so it's nice that (6) doesn't add one. If (6) fails in
> > some not-yet-revealed way, (8) could get more relevant.
> >
> > > I guess idea #6 must be pretty easy to try: just point that MALLOC()
> > > macro to palloc(), and do a plain old CFI() in rcancelrequested().
>
> It's not quite so easy: in RE_compile_and_cache we construct objects
> with arbitrary cache-managed lifetime, which suggests we need a cache
> memory context, but we could also fail mid construction, which
> suggests we'd need a dedicated per-regex object memory context that is
> made permanent with the MemoryContextSetParent() trick (as we see
> elsewhere for cached things that are constructed by code that might
> throw), ...

Here's an experiment-grade attempt at idea #6 done that way, for
discussion. You can see how much memory is wasted by each regex_t,
which I guess is probably on the order of a couple of hundred KB if
you use all 32 regex cache slots using ALLOCSET_SMALL_SIZES as I did
here:

postgres=# select 'x' ~ 'hello world .*';
-[ RECORD 1 ]
?column? | f

postgres=# select * from pg_backend_memory_contexts where name =
'RegexpMemoryContext';
-[ RECORD 1 ]-+-------------------------
name | RegexpMemoryContext
ident | hello world .*
parent | RegexpCacheMemoryContext
level | 2
total_bytes | 13376
total_nblocks | 5
free_bytes | 5144
free_chunks | 8
used_bytes | 8232

There's some more memory allocated in regc_pg_locale.c with raw
malloc() that could probably benefit from a pallocisation just to be
able to measure it, but I didn't touch that here.

The recovery conflict patch itself is unchanged, except that I removed
the claim in the commit message that this would be back-patched. It's
pretty clear that this would need to spend a decent amount of time on
master only.

Attachment Content-Type Size
v4-0001-Use-MemoryContext-API-for-regexp-memory-managemen.patch text/x-patch 8.7 KB
v4-0002-Fix-recovery-conflict-SIGUSR1-handling.patch text/x-patch 18.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2023-01-04 03:56:36 Re: pgsql: Delay commit status checks until freezing executes.
Previous Message David Rowley 2023-01-04 03:41:42 Re: [PATCH] Improve ability to display optimizer analysis using OPTIMIZER_DEBUG