Re: Using WaitEventSet in the postmaster

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using WaitEventSet in the postmaster
Date: 2022-12-02 02:36:03
Message-ID: CA+hUKGK8PO8JLD1sgGHz1xePJrvGp9bnaGXM5QemcA1MA8fKMQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Dec 2, 2022 at 2:40 PM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2022-12-02 10:12:25 +1300, Thomas Munro wrote:
> > with a latch as the wakeup mechanism for "PM signals" (requests from
> > backends to do things like start a background worker, etc).
>
> Hm - is that directly related? ISTM that using a WES in the main loop, and
> changing pmsignal.c to a latch are somewhat separate things?

Yeah, that's a good question. This comes from a larger patch set
where my *goal* was to use latches everywhere possible for
interprocess wakeups, but it does indeed make a lot of sense to do the
postmaster WaitEventSet retrofit completely independently of that, and
leave the associated robustness problems for later proposals (the
posted patch clearly fails to solve them).

> I don't think b) is the case as the patch stands. Imagine some process
> overwriting pm_latch->owner_pid. That'd then break the SetLatch() in
> postmaster's signal handler, because it wouldn't realize that itself needs to
> be woken up, and we'd just signal some random process.

Right. At some point I had an idea about a non-shared table of
latches where OS-specific things like pids and HANDLEs live, so only
the maybe_waiting and is_set flags are in shared memory, and even
those are ignored when accessing the latch in 'robust' mode (they're
only optimisations after all). I didn't try it though. First you
might have to switch to a model with a finite set of latches
identified by index, or something like that. But I like your idea of
separating that whole problem.

> It doesn't seem trivial (but not impossible either) to make SetLatch() robust
> against arbitrary corruption. So it seems easier to me to just put the latch
> in process local memory, and do a SetLatch() in postmaster's SIGUSR1 handler.

Alright, good idea, I'll do a v2 like that.
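
To illustrate the shape of that approach, here is a minimal standalone
sketch of a latch kept in process-local memory and set from a SIGUSR1
handler. It is not PostgreSQL's latch code and not the v2 patch: the
LocalLatch type and the local_latch_* functions are hypothetical names
invented for this example, and a plain POSIX self-pipe stands in for
the real wakeup mechanism. Because nothing here lives in shared memory,
no other process can overwrite the latch state.

```c
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stdbool.h>
#include <unistd.h>

/* Hypothetical process-local latch; all state is private to this process. */
typedef struct LocalLatch
{
    volatile sig_atomic_t is_set;
    int         pipe_fds[2];    /* self-pipe: [0] = read end, [1] = write end */
} LocalLatch;

static LocalLatch local_latch;

/* Create the self-pipe and make both ends non-blocking. Returns 0 on success. */
static int
local_latch_init(LocalLatch *latch)
{
    latch->is_set = 0;
    if (pipe(latch->pipe_fds) != 0)
        return -1;
    for (int i = 0; i < 2; i++)
    {
        int         flags = fcntl(latch->pipe_fds[i], F_GETFL);

        if (flags == -1 ||
            fcntl(latch->pipe_fds[i], F_SETFL, flags | O_NONBLOCK) == -1)
            return -1;
    }
    return 0;
}

/* Set the latch. Uses only async-signal-safe calls, so a handler may call it. */
static void
local_latch_set(LocalLatch *latch)
{
    int         save_errno = errno;

    if (!latch->is_set)
    {
        ssize_t     rc;

        latch->is_set = 1;
        rc = write(latch->pipe_fds[1], "x", 1); /* wake up a pending poll() */
        (void) rc;              /* a full pipe already guarantees a wakeup */
    }
    errno = save_errno;
}

/* Clear the flag first, then drain the pipe; recheck for work afterwards. */
static void
local_latch_reset(LocalLatch *latch)
{
    char        buf[64];

    latch->is_set = 0;
    while (read(latch->pipe_fds[0], buf, sizeof(buf)) > 0)
        ;
}

/*
 * Wait until the latch is set (returns true) or the timeout expires or
 * poll() fails (returns false). An EINTR restarts the wait with the full
 * timeout, which is good enough for a sketch.
 */
static bool
local_latch_wait(LocalLatch *latch, int timeout_ms)
{
    struct pollfd pfd = {.fd = latch->pipe_fds[0], .events = POLLIN};

    while (!latch->is_set)
    {
        int         rc = poll(&pfd, 1, timeout_ms);

        if (rc == 0)
            return false;
        if (rc < 0 && errno != EINTR)
            return false;
    }
    return true;
}

/* SIGUSR1 handler: the only thing it does is set the process-local latch. */
static void
handle_sigusr1(int signo)
{
    (void) signo;
    local_latch_set(&local_latch);
}
```

A caller would run local_latch_init(&local_latch), install
handle_sigusr1 for SIGUSR1, and then loop over local_latch_wait(),
local_latch_reset() and a check for pending requests. The pipe covers
the window between testing is_set and entering poll(): a signal that
arrives there leaves a byte in the pipe, so poll() returns at once.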
