| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Interrupts vs signals |
| Date: | 2026-01-29 01:05:33 |
| Message-ID: | b88a764a-b2e4-4dac-9d25-7249215db2b7@iki.fi |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Here's a new rebased and massively re-worked patch.
The patches are split differently than in the previous version:
- Patches 0001-0005 are just refactoring around recovery conflicts which
I posted on a separate thread [1]. Please review and comment there.
- Patches 0006 and 0007 are small cleanups that could be applied early
(after updating the docs, as noted in TODO comment there).
- All the interesting bits for this thread are in the last, massive patch.
To review this, I suggest starting from the new
src/backend/ipc/README.md file. It gives a good overview of the
mechanism (or if it doesn't, that's valuable feedback :-) ). It also
contains a bunch of Open Questions at the bottom; I'd love to hear
opinions and ideas on those.
Wrt. the issue of running out of bits: My current plan is to just switch
to 64-bit interrupt masks. That gives a fair amount of headroom. Would
be nice to have even more headroom, but I think 64 bits will be enough
for a long time.
The big high-level design change in this version is to address these issues:
On 15/07/2025 19:50, Andres Freund wrote:
> I'm not really in love how this looks like yet, tbh. Aspects of it seem to buy
> further into some of our already existing design mistakes. I have two main
> concerns:
>
> 1) It seems rather bonkers that we have to go to every process type and add
> code for INTERRUPT_LOG_MEMORY_CONTEXT or INTERRUPT_CONFIG_RELOAD.
>
> 2) The fact that a lot of code specifies explicit interrupt masks makes that
> code harder to compose, because whether we can process some interrupt or not
> depends on higher level state. That's e.g. why INTERRUPT_CFI_MASK() has to be
> a bit magic and decid ewhat interrupt mask to use and why we need a separate
> interrupt handling functions for interrupts happening while waiting for
> network IO than normally.
>
>
> For 1), I suspect that we should have a central mapping array between
> InterruptTypes and the code to handle each of the interrupt types. That
> mapping table also could serve as the registry for extensions to register
> their own interrupt ids.
That's a great idea, and that's what this patch now does. You can now
register handler functions for interrupts which will then be called by
CHECK_FOR_INTERRUPTS() whenever the corresponding interrupt is pending.
ProcessInterrupts has no knowledge about what each interrupt means, it
just calls the handler functions in an array.
At backend startup, a "default" set of interrupt handlers are installed,
like this: (a shortened excerpt from SetStandardInterruptHandlers())
/* All processes should react to barriers and memory context debugging */
SetInterruptHandler(INTERRUPT_BARRIER, ProcessProcSignalBarrier);
EnableInterrupt(INTERRUPT_BARRIER);
SetInterruptHandler(INTERRUPT_LOG_MEMORY_CONTEXT,
ProcessLogMemoryContextInterrupt);
EnableInterrupt(INTERRUPT_LOG_MEMORY_CONTEXT);
/*
* Every process should react to INTERRUPT_TERMINATE. But many processes
* disable this and do their own checks at appropriate times.
*/
SetInterruptHandler(INTERRUPT_TERMINATE, ProcessTerminateInterrupt);
EnableInterrupt(INTERRUPT_TERMINATE);
...
Many processes have additional interrupts, or want different handlers
for the standard interrupts. They make additional SetInterruptHandler()
calls to set up their process-specific handlers.
This largely solves problems 1) and 2). I replaced the
INTERRUPT_CFI_MASK() macro with a global variable
CheckForInterruptsMask. At all times, it is a bitmask of all the
interrupts that CHECK_FOR_INTERRUPTS() would process and clear. Even
HOLD/RESUME_INTERRUPTS() now updates CheckForInterruptsMask (setting it
to zero in HOLD_INTERRUPTS(), and restoring it on RESUME_INTERRUPTS())
With that facility, a lot of things got nicer. For example, I got rid of
ProcessClientReadInterrupt() altogether. secure_read() now calls plain
old CHECK_FOR_INTERRUPTS(), and it is the caller's responsibility to
enable/disable the interrupt handlers according to what should or should
not be processed. (ProcessClientWriteInterrupt() is gone too, although
now that I look at it, I wonder if that went too far and we should still
refrain from processing interrupts other than INTERRUPT_TERMINATE while
we're sending to the client...)
I'd love to get feedback on this mechanism. And on the patch in general,
of course, but this is the big new part.
> For 2), I wonder if we ought to have a global mask of interrupt kinds that can
> be processed in some context. Instead of having INTERRUPT_CFI_MASK() compute
> what mask to use, we could have things like HOLD_CANCEL_INTERRUPTS be defined
> as something like
>
> if (InterruptHoldCount[CANCEL]++ == 0)
> InterruptMask &= ~CANCEL;
>
> which would allow CHECK_FOR_INTERRUPTS to just use InterruptMask to check for
> to-be-processed interrupts.
I played with that, and it's quite sensible, but then I realized that we
only use HOLD_CANCEL_INTERRUPTS() in a few places, namely when we are
reading a message from the client. We don't really need the nesting
capability, i.e. we don't need a counter, a boolean is enough.
On 15/07/2025 18:50, Andres Freund wrote:
>> @@ -205,6 +206,8 @@ StartupProcExit(int code, Datum arg)
>> /* Shutdown the recovery environment */
>> if (standbyState != STANDBY_DISABLED)
>> ShutdownRecoveryTransactionEnvironment();
>> +
>> + ProcGlobal->startupProc = INVALID_PROC_NUMBER;
>> }
>
>
> What if we instead had a ProcGlobal->auxProc[auxproxtype]? We have different
> versions of this for different types auf aux processes, which doesn't really
> make sense.
I like that idea, but didn't try it yet.
>> + * Perform a plain atomic read first as a fast path for the case that
>> + * an interrupt is already pending.
>> */
>> - if (set->latch && !set->latch->is_set)
>> + old_mask = pg_atomic_read_u32(MyPendingInterrupts);
>> + already_pending = ((old_mask & set->interrupt_mask) != 0);
>> +
>> + if (!already_pending)
>> {
>
>
> Hm, I think this may be incorrect - what if there is *some* overlap between
> MyPendingInterrupts and set->interrupt_mask, but they're not equal?
I don't see the problem. In that case, already_pending will be set to
'true', as it should.
>> +/*
>> + * Test an interrupt flag (of flags).
>> + */
>> +static inline bool
>> +IsInterruptPending(uint32 interruptMask)
>> +{
>> + return (pg_atomic_read_u32(MyPendingInterrupts) & interruptMask) != 0;
>> +}
>
>
> Do we need to care about memory ordering?
Hmm, I think not. If you imagine using this with WaitInterrupt(), the
WaitInterrupt works as a synchronization point. If the interrupt was
just set by another procss, but this function returns a stale "false",
the next WaitInterrupt() will return without sleeping. No other backend
should be clearing our pending interrupt bits, so a stale "true" is not
possible. I'll add a comment.
>> There's still a general-purpose INTERRUPT_GENERAL interrupt that is
>> multiplexed for many different purposes in different processes, for
>> example to wake up the walwriter when it has work to do, and to wake
>> up processes waiting on a condition variable. The common property of
>> those uses is that there's some other flag or condition that is
>> checked on each wakeup, the wakeup interrupt itself means merely that
>> something interesting might've happened.
>
> I guess extensions that just need to use INTERRUPT_GENERAL, given that new
> interrupt reasons can't be dynamically registered?
Correct. I plan to implement a dynamic interrupt allocation mechanism
for extensions, but it's not implemented yet.
(btw I renamed INTERRUPT_GENERAL to INTERRUPT_WAIT_WAKEUP)
>> + if (ConsumeInterrupt(INTERRUPT_CONFIG_RELOAD))
>> {
>> int autovacuum_max_workers_prev = autovacuum_max_workers;
>>
>> - ConfigReloadPending = false;
>> ProcessConfigFile(PGC_SIGHUP);
>>
>> /* shutdown requested in config file? */
>> @@ -771,11 +774,11 @@ ProcessAutoVacLauncherInterrupts(void)
>> }
>>
>> /* Process barrier events */
>> - if (ProcSignalBarrierPending)
>> + if (IsInterruptPending(INTERRUPT_BARRIER))
>> ProcessProcSignalBarrier();
>
>
> Why do we have some code immediately consuming and others just checking if an
> interrupt is pending?
I was just sloppy. Now it's consistent: ProcessInterrupts() now always
clears the interrupt before calling the handler function, and the
handler function does not check or clear the interrupt.
>> @@ -572,10 +574,18 @@ CheckpointerMain(const void *startup_data, size_t startup_data_len)
>> cur_timeout = Min(cur_timeout, XLogArchiveTimeout - elapsed_secs);
>> }
>>
>> - (void) WaitInterrupt(INTERRUPT_GENERAL,
>> - WL_INTERRUPT | WL_TIMEOUT | WL_EXIT_ON_PM_DEATH,
>> - cur_timeout * 1000L /* convert to ms */ ,
>> - WAIT_EVENT_CHECKPOINTER_MAIN);
>> + (void) WaitInterrupt(
>> + /* these are handled in the main loop */
>> + INTERRUPT_GENERAL | /* checkpoint requested */
>
>
> Why is a checkpoint request done via GENERAL? Seems that's specifically one
> that we do *not* want to absorb via CFI?
It works fine because there's a separate shared memory flag that's set
when checkpoint is requested. We check that flag in the main loop, even
if the interrupt is absorbed.
>
>> @@ -947,6 +953,7 @@ void
>> StandbyDeadLockHandler(void)
>> {
>> got_standby_deadlock_timeout = true;
>> + RaiseInterrupt(INTERRUPT_GENERAL);
>> }
>>
>> /*
>> @@ -956,6 +963,7 @@ void
>> StandbyTimeoutHandler(void)
>> {
>> got_standby_delay_timeout = true;
>> + RaiseInterrupt(INTERRUPT_GENERAL);
>> }
>>
>> /*
>> @@ -965,6 +973,7 @@ void
>> StandbyLockTimeoutHandler(void)
>> {
>> got_standby_lock_timeout = true;
>> + RaiseInterrupt(INTERRUPT_GENERAL);
>> }
>
>
> Why are these using INTERRUPT_GENERAL?
Could be changed for sure, but this works too. These are quite localized.
>> +
>> + if (ConsumeInterrupt(INTERRUPT_WALSND_INIT_STOPPING))
>> + ProcessWalSndInitStopping();
>
>
> Huh. So ProcessWalSndInitStopping() raises different inerrupts to actually
> exit and thus relies on being called first? That seems rather clunky.
I don't know about the "being called first" part, but yeah this is
clunky. I had hard time understanding the logic and I'm still not 100% I
got it right to be honest. Some further cleanup for readability would be
nice..
[1]
https://www.postgresql.org/message-id/4cc13ba1-4248-4884-b6ba-4805349e7f39@iki.fi
- Heikki
| Attachment | Content-Type | Size |
|---|---|---|
| v8-0001-Remove-useless-errdetail_abort.patch | text/x-patch | 11.9 KB |
| v8-0002-Don-t-hint-that-you-can-reconnect-when-the-databa.patch | text/x-patch | 2.0 KB |
| v8-0003-Use-ProcNumber-rather-than-pid-in-ReplicationSlot.patch | text/x-patch | 11.1 KB |
| v8-0004-Separate-RecoveryConflictReasons-from-procsignals.patch | text/x-patch | 37.7 KB |
| v8-0005-Refactor-ProcessRecoveryConflictInterrupt-for-rea.patch | text/x-patch | 15.6 KB |
| v8-0006-Centralize-resetting-SIGCHLD-handler.patch | text/x-patch | 9.7 KB |
| v8-0007-Use-standard-die-handler-for-SIGTERM-in-bgworkers.patch | text/x-patch | 2.0 KB |
| v8-0008-Replace-Latches-with-Interrupts.patch | text/x-patch | 508.3 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | jian he | 2026-01-29 01:18:49 | ERROR: failed to find conversion function from unknown to text |
| Previous Message | surya poondla | 2026-01-29 00:56:19 | Re: Add comments about fire_triggers argument in ri_triggers.c |