| From: | Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
|---|---|
| To: | Andres Freund <andres(at)anarazel(dot)de> |
| Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, Fujii Masao <masao(dot)fujii(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Interrupts vs signals |
| Date: | 2026-06-24 20:25:59 |
| Message-ID: | 0aac2b1a-ec0f-4bb6-a786-760e6cea4d1d@iki.fi |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On 10/03/2026 21:43, Andres Freund wrote:
> On 2026-02-20 16:22:48 +0200, Heikki Linnakangas wrote:
>> Patch attached. The relevant changes for this "attention mechanism" are in
>> the last patch, but there are some other small changes too so this split
>> into patches is a little messy. I moved the list of standard interrupts to a
>> separate header file, for example. So for reviewing, I recommend reading the
>> resulting interrupt.h and interrupt.c files after applying all the patches,
>> instead of trying to read the diff for those.
>
> I was just looking at these patches. They needed a rebase. Attached without
> other changes.
Thanks! Here's a new version, rebased again and with a big bunch of
little fixes.
> A few comments:
>
> - I don't think I really understand the interrupts.h / standard_interrupts.h
> split. Also, the latter doesn't have the normal function header etc.
Added standard file header. The idea is that "standard_interrupts.h"
contains the definitions of all the different interrupt, and prototypes
for functions related to them, whereas "interrupts.h" contains
definitions for the interrupt system as whole, with no knowledge of what
any of the individual bits mean.
> - It probably doesn't matter, but it seems that for SendOrRaiseInterrupt()
> could we skip the setting of PI_FLAG_ATTENTION if already set?
Hmm, I guess, but I think it'd just make it more complicated to try to
check for that.
> - I see a bunch of core files like this:
>
>
> #4 0x0000000000d6a4d8 in ExceptionalCondition (conditionName=0x1089ff8 "(pg_atomic_read_u32(&MyPendingInterrupts->flags) & PI_FLAG_SLEEPING) == 0",
> fileName=0x1089f70 "../../../../../home/andres/src/postgresql/src/backend/ipc/interrupt.c", lineNumber=262)
> at ../../../../../home/andres/src/postgresql/src/backend/utils/error/assert.c:65
> #5 0x000000000091aa12 in SwitchMyPendingInterruptsPtr (new_ptr=0x161dc30 <LocalPendingInterrupts>) at ../../../../../home/andres/src/postgresql/src/backend/ipc/interrupt.c:262
> #6 0x000000000091aaaa in SwitchToLocalInterrupts () at ../../../../../home/andres/src/postgresql/src/backend/ipc/interrupt.c:313
> #7 0x0000000000b71dca in AuxiliaryProcKill (code=1, arg=4) at ../../../../../home/andres/src/postgresql/src/backend/storage/lmgr/proc.c:1075
> #8 0x0000000000b411c4 in shmem_exit (code=1) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:282
> #9 0x0000000000b40faa in proc_exit_prepare (code=1) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:198
> #10 0x0000000000b40ef6 in proc_exit (code=1) at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/ipc.c:113
> #11 0x0000000000b53ba6 in WaitEventSetWaitBlock (set=0x257a1b40, cur_timeout=200, occurred_events=0x7fff8d4db130, nevents=1)
> at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/waiteventset.c:1306
> #12 0x0000000000b53922 in WaitEventSetWait (set=0x257a1b40, timeout=200, occurred_events=0x7fff8d4db130, nevents=1, wait_event_info=83886083)
> at ../../../../../home/andres/src/postgresql/src/backend/storage/ipc/waiteventset.c:1170
> #13 0x000000000091ae79 in WaitInterrupt (interruptMask=122, wakeEvents=41, timeout=200, wait_event_info=83886083)
> at ../../../../../home/andres/src/postgresql/src/backend/ipc/interrupt.c:482
> #14 0x0000000000a7629c in BackgroundWriterMain (startup_data=0x0, startup_data_len=0) at ../../../../../home/andres/src/postgresql/src/backend/postmaster/bgwriter.c:298
> #15 0x0000000000a7869a in postmaster_child_launch (child_type=B_BG_WRITER, child_slot=35, startup_data=0x0, startup_data_len=0, client_sock=0x0)
> at ../../../../../home/andres/src/postgresql/src/backend/postmaster/launch_backend.c:268
>
>
> interestingly that doesn't trigger any test failures.
A-ha, yeah, that doesn't cause test failure because that happens in the
WL_EXIT_ON_PM_DEATH case. Fixed by clearing the PI_FLAG_SLEEPING flag
before bailing out in WaitEventSetWait() subroutines.
> - Are we generally ok with having no memory barriers around
> CHECK_FOR_INTERRUPTS()/InterruptPending()/ConsumeInterrupt()? It might be,
> because we shouldn't rely on immediate processing of the interrupts, but at
> the very least we need some documentation for why that is true.
I think so. There's a comment in InterruptPending() about that. We could
make it more prominent by moving it to CHECK_FOR_INTERRUPTS(). I'm not
sure if this can ever be a problem in practice though.
> Practically it's probably ok, even on architectures with loose memory
> ordering, we don't typically have loops that do something like
>
> while (true)
> {
> CFI();
> very_cpu_intensive_but_nothing_else();
> }
>
> And once you actually wait for an interrupt, lock a page, or ... we'll be
> guaranteed to see the interrupt.
Hmm, even without a barrier, that won't get stuck forever, right?
InterruptPending() uses pg_atomic_read_u64(), which still forces the
value to be read on each iteration (the arg is 'volatile'). There's no
cache coherency but I believe the CPU will *eventually* see the new
value. I don't have a good idea of what "eventually" means on different
architectures, but I'd guess << 1 s.
> - CFI now is:
>
> ; determine address of MyPendingInterruptsFlags
> mov 0x0(%rip),%rax # 3c9 <ExecScan+0x3c9>
> 3c5: R_X86_64_REX_GOTPCRELX MyPendingInterruptsFlags-0x4
>
> ; load *MyPendingInterruptsFlags
> mov (%rax),%rax
>
> ; load the contents of flags
> mov (%rax),%eax
>
> ; test if flags is 0
> test %eax,%eax
>
> ; jump if flags are set
> jne 51b <ExecScan+0x51b>
>
>
> That's one indirection worse than what you had in
> https://www.postgresql.org/message-id/78c102b3-f1a6-4c70-a46f-2f04221b6193%40iki.fi
>
> I don't know if it matters.
AFAICS it's three indirections in both. Here are both again, with the
indirections numbered:
OLD:
; load address of MyPendingInterrupts->flags
(1) lea 0x433f0f(%rip),%rax # 0x9728a0 <MyPendingInterrupts>
; load value of MyPendingInterrupts->flags
(2) mov (%rax),%rax
; test if MyPendingInterrupts->flags == 0
(3) cmpl $0x0,(%rax)
; jump for ProcessInterrupts
jne 0x53e9fd <makeItemUnary+173>
NEW:
; determine address of MyPendingInterruptsFlags
(1) mov 0x0(%rip),%rax # 3c9 <ExecScan+0x3c9>
3c5: R_X86_64_REX_GOTPCRELX
MyPendingInterruptsFlags-0x4
; load *MyPendingInterruptsFlags
(2) mov (%rax),%rax
; load the contents of flags
(3) mov (%rax),%eax
; test if flags is 0
test %eax,%eax
; jump if flags are set
jne 51b <ExecScan+0x51b>
> - This continues to be a very very large commit. As I IIRC suggested before,
> I'd at least introduce ipc/interrupt.h (including adding the #includes for
> it)
Done. There's now a separate commit to move CHECK_FOR_INTERRUPTS() to
ipc/interrupt.h, and a few other preparatory refactorings.
> I'd probably go further and introduce the new infra separately from
> switching over to it, but...
I feel the intermediate state would be too confusing to really help with
the review.
> - Wonder if it's worth putting the WIN32 stuff from InitProcGlobal() into its
> own helper? Probably not, at least until there's another platform specific
> thing?
(InitProcGlobal() is called ProcGlobalShmemInit() now) It seems fine to
me as it is. There's not much WIN32-specific code there, or elsewhere in
these patches.
> - Does it matter that we have a bunch of code that does
> if (InterruptPending(a))
> stuff();
> if (InterruptPending(b))
> stuff();
> if (ConsumeInterrupt(c))
> stuff();
> if (ConsumeInterrupt(d))
> stuff();
>
> that does end up rereading the same variable from shared memory
> repeatedly...
Hmm, it's probably fine, but if there's some place where that's
performance critical, you could do a quick combined check first:
if (InterruptPending(a | b | c | d))
{
if (InterruptPending(a))
stuff();
if (InterruptPending(b))
stuff();
if (ConsumeInterrupt(c))
stuff();
if (ConsumeInterrupt(d))
stuff();
}
Or we could add a function to return the pending mask so that you'd only
need to read it once. I haven't really felt the need for that so far,
but it would seem pretty logical to have one.
> - Personally I rather dislike anonymous struct types (like PendingInterrupts),
> as some debuggers and similar tooling won't be able to show names for them.
Fixed
> - I wonder if we should abstract the PG_HAVE_ATOMIC_U64_SIMULATION crap
> somehow (or just remove platforms without it, but I don't think I want to be
> the one doing that fight)
I think it's fine as it is, there are only a few functions that need to
check PG_HAVE_ATOMIC_U64_SIMULATION currently. (I'd be all in favor of
desupporting PG_HAVE_ATOMIC_U64_SIMULATION though)
- Heikki
| Attachment | Content-Type | Size |
|---|---|---|
| v13-0001-Refactor-how-some-aux-processes-advertise-their-.patch | text/x-patch | 9.6 KB |
| v13-0002-Centralize-resetting-SIGCHLD-handler.patch | text/x-patch | 9.8 KB |
| v13-0003-Refactor-datachecksums-launcher-worker-comms-to-.patch | text/x-patch | 5.4 KB |
| v13-0004-Refactor-PARALLEL_MESSAGE-procsignal-signaling.patch | text/x-patch | 15.0 KB |
| v13-0005-Move-CHECK_FOR_INTERRUPTS-and-friends-to-new-ipc.patch | text/x-patch | 116.7 KB |
| v13-0006-Rename-postmaster-interrupt.c-to-ipc-signal_hand.patch | text/x-patch | 20.1 KB |
| v13-0007-Remove-dependency-on-utils-resowner.h-just-for-R.patch | text/x-patch | 2.2 KB |
| v13-0008-Remove-unnecessary-include-from-tcopprot.h.patch | text/x-patch | 4.3 KB |
| v13-0009-Replace-Latches-with-Interrupts.patch | text/x-patch | 495.1 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Tomas Vondra | 2026-06-24 20:26:29 | Re: Adding basic NUMA awareness |
| Previous Message | Jelte Fennema-Nio | 2026-06-24 20:16:49 | Re: The PostgreSQL C Dialect |