Re: Interrupts vs signals

From: Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Interrupts vs signals
Date: 2026-01-29 01:05:33
Message-ID: b88a764a-b2e4-4dac-9d25-7249215db2b7@iki.fi
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Here's a new rebased and massively re-worked patch.

The patches are split differently than in the previous version:
- Patches 0001-0005 are just refactoring around recovery conflicts which
I posted on a separate thread [1]. Please review and comment there.
- Patches 0006 and 0007 are small cleanups that could be applied early
(after updating the docs, as noted in TODO comment there).
- All the interesting bits for this thread are in the last, massive patch.

To review this, I suggest starting from the new
src/backend/ipc/README.md file. It gives a good overview of the
mechanism (or if it doesn't, that's valuable feedback :-) ). It also
contains a bunch of Open Questions at the bottom; I'd love to hear
opinions and ideas on those.

Wrt. the issue of running out of bits: My current plan is to just switch
to 64-bit interrupt masks. That gives a fair amount of headroom. Would
be nice to have even more headroom, but I think 64 bits will be enough
for a long time.

The big high-level design change in this version is to address these issues:

On 15/07/2025 19:50, Andres Freund wrote:
> I'm not really in love how this looks like yet, tbh. Aspects of it seem to buy
> further into some of our already existing design mistakes. I have two main
> concerns:
>
> 1) It seems rather bonkers that we have to go to every process type and add
> code for INTERRUPT_LOG_MEMORY_CONTEXT or INTERRUPT_CONFIG_RELOAD.
>
> 2) The fact that a lot of code specifies explicit interrupt masks makes that
> code harder to compose, because whether we can process some interrupt or not
> depends on higher level state. That's e.g. why INTERRUPT_CFI_MASK() has to be
> a bit magic and decid ewhat interrupt mask to use and why we need a separate
> interrupt handling functions for interrupts happening while waiting for
> network IO than normally.
>
>
> For 1), I suspect that we should have a central mapping array between
> InterruptTypes and the code to handle each of the interrupt types. That
> mapping table also could serve as the registry for extensions to register
> their own interrupt ids.

That's a great idea, and that's what this patch now does. You can now
register handler functions for interrupts which will then be called by
CHECK_FOR_INTERRUPTS() whenever the corresponding interrupt is pending.
ProcessInterrupts has no knowledge about what each interrupt means, it
just calls the handler functions in an array.

At backend startup, a "default" set of interrupt handlers are installed,
like this: (a shortened excerpt from SetStandardInterruptHandlers())

/* All processes should react to barriers and memory context debugging */
SetInterruptHandler(INTERRUPT_BARRIER, ProcessProcSignalBarrier);
EnableInterrupt(INTERRUPT_BARRIER);
SetInterruptHandler(INTERRUPT_LOG_MEMORY_CONTEXT,
ProcessLogMemoryContextInterrupt);
EnableInterrupt(INTERRUPT_LOG_MEMORY_CONTEXT);

/*
* Every process should react to INTERRUPT_TERMINATE. But many processes
* disable this and do their own checks at appropriate times.
*/
SetInterruptHandler(INTERRUPT_TERMINATE, ProcessTerminateInterrupt);
EnableInterrupt(INTERRUPT_TERMINATE);

...

Many processes have additional interrupts, or want different handlers
for the standard interrupts. They make additional SetInterruptHandler()
calls to set up their process-specific handlers.

This largely solves problems 1) and 2). I replaced the
INTERRUPT_CFI_MASK() macro with a global variable
CheckForInterruptsMask. At all times, it is a bitmask of all the
interrupts that CHECK_FOR_INTERRUPTS() would process and clear. Even
HOLD/RESUME_INTERRUPTS() now updates CheckForInterruptsMask (setting it
to zero in HOLD_INTERRUPTS(), and restoring it on RESUME_INTERRUPTS())

With that facility, a lot of things got nicer. For example, I got rid of
ProcessClientReadInterrupt() altogether. secure_read() now calls plain
old CHECK_FOR_INTERRUPTS(), and it is the caller's responsibility to
enable/disable the interrupt handlers according to what should or should
not be processed. (ProcessClientWriteInterrupt() is gone too, although
now that I look at it, I wonder if that went too far and we should still
refrain from processing interrupts other than INTERRUPT_TERMINATE while
we're sending to the client...)

I'd love to get feedback on this mechanism. And on the patch in general,
of course, but this is the big new part.

> For 2), I wonder if we ought to have a global mask of interrupt kinds that can
> be processed in some context. Instead of having INTERRUPT_CFI_MASK() compute
> what mask to use, we could have things like HOLD_CANCEL_INTERRUPTS be defined
> as something like
>
> if (InterruptHoldCount[CANCEL]++ == 0)
> InterruptMask &= ~CANCEL;
>
> which would allow CHECK_FOR_INTERRUPTS to just use InterruptMask to check for
> to-be-processed interrupts.

I played with that, and it's quite sensible, but then I realized that we
only use HOLD_CANCEL_INTERRUPTS() in a few places, namely when we are
reading a message from the client. We don't really need the nesting
capability, i.e. we don't need a counter, a boolean is enough.

On 15/07/2025 18:50, Andres Freund wrote:
>> @@ -205,6 +206,8 @@ StartupProcExit(int code, Datum arg)
>> /* Shutdown the recovery environment */
>> if (standbyState != STANDBY_DISABLED)
>> ShutdownRecoveryTransactionEnvironment();
>> +
>> + ProcGlobal->startupProc = INVALID_PROC_NUMBER;
>> }
>
>
> What if we instead had a ProcGlobal->auxProc[auxproxtype]? We have different
> versions of this for different types auf aux processes, which doesn't really
> make sense.

I like that idea, but didn't try it yet.

>> + * Perform a plain atomic read first as a fast path for the case that
>> + * an interrupt is already pending.
>> */
>> - if (set->latch && !set->latch->is_set)
>> + old_mask = pg_atomic_read_u32(MyPendingInterrupts);
>> + already_pending = ((old_mask & set->interrupt_mask) != 0);
>> +
>> + if (!already_pending)
>> {
>
>
> Hm, I think this may be incorrect - what if there is *some* overlap between
> MyPendingInterrupts and set->interrupt_mask, but they're not equal?

I don't see the problem. In that case, already_pending will be set to
'true', as it should.

>> +/*
>> + * Test an interrupt flag (of flags).
>> + */
>> +static inline bool
>> +IsInterruptPending(uint32 interruptMask)
>> +{
>> + return (pg_atomic_read_u32(MyPendingInterrupts) & interruptMask) != 0;
>> +}
>
>
> Do we need to care about memory ordering?

Hmm, I think not. If you imagine using this with WaitInterrupt(), the
WaitInterrupt works as a synchronization point. If the interrupt was
just set by another procss, but this function returns a stale "false",
the next WaitInterrupt() will return without sleeping. No other backend
should be clearing our pending interrupt bits, so a stale "true" is not
possible. I'll add a comment.

>> There's still a general-purpose INTERRUPT_GENERAL interrupt that is
>> multiplexed for many different purposes in different processes, for
>> example to wake up the walwriter when it has work to do, and to wake
>> up processes waiting on a condition variable. The common property of
>> those uses is that there's some other flag or condition that is
>> checked on each wakeup, the wakeup interrupt itself means merely that
>> something interesting might've happened.
>
> I guess extensions that just need to use INTERRUPT_GENERAL, given that new
> interrupt reasons can't be dynamically registered?

Correct. I plan to implement a dynamic interrupt allocation mechanism
for extensions, but it's not implemented yet.

(btw I renamed INTERRUPT_GENERAL to INTERRUPT_WAIT_WAKEUP)

>> + if (ConsumeInterrupt(INTERRUPT_CONFIG_RELOAD))
>> {
>> int autovacuum_max_workers_prev = autovacuum_max_workers;
>>
>> - ConfigReloadPending = false;
>> ProcessConfigFile(PGC_SIGHUP);
>>
>> /* shutdown requested in config file? */
>> @@ -771,11 +774,11 @@ ProcessAutoVacLauncherInterrupts(void)
>> }
>>
>> /* Process barrier events */
>> - if (ProcSignalBarrierPending)
>> + if (IsInterruptPending(INTERRUPT_BARRIER))
>> ProcessProcSignalBarrier();
>
>
> Why do we have some code immediately consuming and others just checking if an
> interrupt is pending?

I was just sloppy. Now it's consistent: ProcessInterrupts() now always
clears the interrupt before calling the handler function, and the
handler function does not check or clear the interrupt.

>> @@ -572,10 +574,18 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
>> cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
>> }
>>
>> - (void) WaitInterrupt(INTERRUPT_GENERAL,
>> - WL_INTERRUPT | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
>> - cur_timeout * 1000L /* convert to ms */ ,
>> - WAIT_EVENT_CHECKPOINTER_MAIN);
>> + (void) WaitInterrupt(
>> + /* these are handled in the main loop */
>> + INTERRUPT_GENERAL | /* checkpoint requested */
>
>
> Why is a checkpoint request done via GENERAL? Seems that's specifically one
> that we do *not* want to absorb via CFI?

It works fine because there's a separate shared memory flag that's set
when checkpoint is requested. We check that flag in the main loop, even
if the interrupt is absorbed.

>
>> @@ -947,6 +953,7 @@ void
>> StandbyDeadLockHandler(void)
>> {
>> got_standby_deadlock_timeout = true;
>> + RaiseInterrupt(INTERRUPT_GENERAL);
>> }
>>
>> /*
>> @@ -956,6 +963,7 @@ void
>> StandbyTimeoutHandler(void)
>> {
>> got_standby_delay_timeout = true;
>> + RaiseInterrupt(INTERRUPT_GENERAL);
>> }
>>
>> /*
>> @@ -965,6 +973,7 @@ void
>> StandbyLockTimeoutHandler(void)
>> {
>> got_standby_lock_timeout = true;
>> + RaiseInterrupt(INTERRUPT_GENERAL);
>> }
>
>
> Why are these using INTERRUPT_GENERAL?

Could be changed for sure, but this works too. These are quite localized.

>> +
>> + if (ConsumeInterrupt(INTERRUPT_WALSND_INIT_STOPPING))
>> + ProcessWalSndInitStopping();
>
>
> Huh. So ProcessWalSndInitStopping() raises different inerrupts to actually
> exit and thus relies on being called first? That seems rather clunky.

I don't know about the "being called first" part, but yeah this is
clunky. I had hard time understanding the logic and I'm still not 100% I
got it right to be honest. Some further cleanup for readability would be
nice..

[1]
https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi

- Heikki

Attachment Content-Type Size
v8-0001-Remove-useless-errdetail_abort.patch text/x-patch 11.9 KB
v8-0002-Don-t-hint-that-you-can-reconnect-when-the-databa.patch text/x-patch 2.0 KB
v8-0003-Use-ProcNumber-rather-than-pid-in-ReplicationSlot.patch text/x-patch 11.1 KB
v8-0004-Separate-RecoveryConflictReasons-from-procsignals.patch text/x-patch 37.7 KB
v8-0005-Refactor-ProcessRecoveryConflictInterrupt-for-rea.patch text/x-patch 15.6 KB
v8-0006-Centralize-resetting-SIGCHLD-handler.patch text/x-patch 9.7 KB
v8-0007-Use-standard-die-handler-for-SIGTERM-in-bgworkers.patch text/x-patch 2.0 KB
v8-0008-Replace-Latches-with-Interrupts.patch text/x-patch 508.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message jian he 2026-01-29 01:18:49 ERROR: failed to find conversion function from unknown to text
Previous Message surya poondla 2026-01-29 00:56:19 Re: Add comments about fire_triggers argument in ri_triggers.c