Re: Using WaitEventSet in the postmaster

From: Andres Freund <andres(at)anarazel(dot)de>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using WaitEventSet in the postmaster
Date: 2022-12-02 01:40:22
Message-ID: 20221202014022.j4hwmjnysasdg5yn@awork3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2022-12-02 10:12:25 +1300, Thomas Munro wrote:
> Here's a work-in-progress patch that uses WaitEventSet for the main
> event loop in the postmaster

Wee!

> with a latch as the wakeup mechanism for "PM signals" (requests from
> backends to do things like start a background worker, etc).

Hm - is that directly related? ISTM that using a WES in the main loop, and
changing pmsignal.c to a latch are somewhat separate things?

Using a latch for pmsignal.c seems like a larger lift, because it means that
all of latch.c needs to be robust against a corrupted struct Latch.

> In order to avoid adding a new dependency on the contents of shared
> memory, I introduced SetLatchRobustly() that will always use the slow
> path kernel wakeup primitive, even in cases where SetLatch() would
> not. The idea here is that if one backend trashes shared memory,
> other backends can still wake the postmaster even though it may
> appear that the postmaster isn't waiting or the latch is already set.

Why is that a concern that needs to be addressed?

ISTM that the important thing is that either a) the postmaster's latch can't
be corrupted, because it's not shared with backends or b) struct Latch can be
overwritten with random contents without causing additional problems in
postmaster.

I don't think b) is the case as the patch stands. Imagine some process
overwriting pm_latch->owner_pid. That would break the SetLatch() in
postmaster's signal handler: it would no longer realize that postmaster itself
needs to be woken up, and we'd just signal some random process.
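
To make the failure mode concrete, here is a simplified model of the wakeup
decision, not the actual latch.c code. ToyLatch, WakeAction and
toy_set_latch() are made-up names for illustration; the real SetLatch() also
needs memory barriers and performs the wakeup itself.

```c
#include <stdbool.h>
#include <sys/types.h>

/* Toy stand-in for struct Latch; only the fields relevant here. */
typedef struct ToyLatch
{
    volatile int  is_set;
    volatile bool maybe_sleeping;
    pid_t         owner_pid;    /* lives in shared memory, so can be trashed */
} ToyLatch;

typedef enum WakeAction
{
    WAKE_NONE,      /* already set, owner not sleeping, or latch unowned */
    WAKE_SELF,      /* caller owns the latch: wake our own wait loop */
    WAKE_SIGNAL     /* someone else owns it: signal *target */
} WakeAction;

/*
 * Decide what a SetLatch()-style call would do. On WAKE_SIGNAL, *target is
 * the pid that would be signalled; otherwise it is set to 0.
 */
WakeAction
toy_set_latch(ToyLatch *latch, pid_t my_pid, pid_t *target)
{
    *target = 0;

    if (latch->is_set)
        return WAKE_NONE;
    latch->is_set = 1;

    if (!latch->maybe_sleeping)
        return WAKE_NONE;
    if (latch->owner_pid == 0)
        return WAKE_NONE;
    if (latch->owner_pid == my_pid)
        return WAKE_SELF;

    /* owner_pid is trusted blindly: if corrupted, this is a random pid */
    *target = latch->owner_pid;
    return WAKE_SIGNAL;
}
```

With an intact owner_pid, a call made by the owner takes the WAKE_SELF branch.
If another process has overwritten owner_pid, the same call falls through to
WAKE_SIGNAL with whatever pid happens to be stored there.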

Making SetLatch() robust against arbitrary corruption doesn't seem trivial
(though not impossible either). So it seems easier to me to just put the latch
in process-local memory, and do a SetLatch() in postmaster's SIGUSR1 handler.
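
A rough sketch of the shape I have in mind, using a plain sig_atomic_t flag
as a stand-in for a process-local latch. All names here are hypothetical and
this is not PostgreSQL code; a real version would call SetLatch() on a local
struct Latch and wait on it via the WaitEventSet.

```c
#define _POSIX_C_SOURCE 200809L

#include <signal.h>
#include <stdbool.h>
#include <string.h>

/* Process-local state: other processes cannot scribble on this. */
static volatile sig_atomic_t local_latch_set = 0;

/* SIGUSR1 handler: only sets the local flag, which is async-signal-safe. */
static void
handle_sigusr1(int signo)
{
    (void) signo;
    local_latch_set = 1;
}

/* Install the handler. Returns 0 on success, -1 on failure. */
int
install_local_latch_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = handle_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(SIGUSR1, &sa, NULL);
}

/*
 * Called from the main loop: reports whether a wakeup was requested and
 * resets the flag, analogous to checking and then resetting a latch.
 */
bool
consume_local_latch(void)
{
    if (local_latch_set)
    {
        local_latch_set = 0;
        return true;
    }
    return false;
}
```

Backends would keep sending SIGUSR1 as they do today. Nothing they can
overwrite in shared memory influences whether postmaster notices the wakeup.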

Greetings,

Andres Freund
