From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: Alexander Lakhin <exclusion(at)gmail(dot)com>
Cc: Michael Banck <mbanck(at)gmx(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Michael Paquier <michael(at)paquier(dot)xyz>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: GNU/Hurd portability patches
Date: 2025-10-12 00:42:30
Message-ID: CA+hUKGKo6rToAxyObEnOnOiT3QEJuJFVkVNqikw0wVFHag=3Fg@mail.gmail.com
Lists: pgsql-hackers
On Sun, Oct 12, 2025 at 1:00 AM Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
> !!!wrapper_handler[1988]| postgres_signal_arg: 30, PG_NSIG: 33
> !!!wrapper_handler[1989]| postgres_signal_arg: 30, PG_NSIG: 33
> !!!wrapper_handler[3284]| postgres_signal_arg: 14, PG_NSIG: 33
> !!!wrapper_handler[3284]| postgres_signal_arg: 28476608, PG_NSIG: 33
> TRAP: failed Assert("postgres_signal_arg < PG_NSIG"), File: "pqsignal.c", Line: 94, PID: 3284
Hmm. We only install the handler for real signal numbers, and it
clearly managed to find the handler, so then how did it corrupt signo
before calling the function? I wonder if there could be concurrency bugs
reached by our perhaps unusually large amount of signaling (we have
found bugs in the signal implementations of several other OSes...).
This might be the code:
https://github.com/bminor/glibc/blob/master/hurd/hurdsig.c#L639
It appears to suspend the thread selected to handle the signal, mess
with its stack/context and then resume it, just as traditional
monolithic kernels do; it's just done in user space by code running in
a helper thread that communicates over Mach ports. So it looks like I
misunderstood that comment in the docs: it's not the handler itself
that runs in a different thread, unless I'm looking at the wrong code
(?).
Some random thoughts after skim-reading that and
glibc/sysdeps/mach/hurd/x86/trampoline.c:
* I wonder if setting up sigaltstack() and then using SA_ONSTACK in
pqsignal() would behave differently (rough sketch after this list),
though the SysV AMD64 calling convention (used by Hurd IIUC) passes the
first argument in %rdi, not on the stack, so I don't really expect that
to be relevant...
* I wonder about the special code paths for handlers that were already
running and happened to be in sigreturn(), or something like that,
which I didn't study at all, but it occurred to me that our pqsignal
will only block the signal itself while running a handler (since it
doesn't specify SA_NODEFER)... so what happens if you block all
signals while running each handler by changing
sigemptyset(&act.sa_mask) to sigfillset(&act.sa_mask)? (Second sketch
after this list.)
* I see special code paths for threads that were in (its notion of)
critical sections, which must be rare, but it looks like those just
leave the signal pending, which seems reasonable
* I see special code paths for SIGIO and SIGURG that I didn't try to
understand, but I wonder what would happen if we s/SIGURG/SIGXCPU/
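For the first point, the sigaltstack()/SA_ONSTACK experiment would look
roughly like this. This is a generic sketch, not a drop-in change to
pqsignal.c, with my_handler standing in for the real wrapper_handler:

#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical handler standing in for pqsignal.c's wrapper_handler. */
static void
my_handler(int signo)
{
	(void) signo;
}

static void
install_with_altstack(int signo)
{
	static stack_t ss;
	struct sigaction act;

	/* One-time setup: give signal handlers their own dedicated stack. */
	if (ss.ss_sp == NULL)
	{
		ss.ss_sp = malloc(SIGSTKSZ);
		ss.ss_size = SIGSTKSZ;
		ss.ss_flags = 0;
		sigaltstack(&ss, NULL);
	}

	memset(&act, 0, sizeof(act));
	act.sa_handler = my_handler;
	sigemptyset(&act.sa_mask);
	/* ... and ask for the handler to run on that alternate stack. */
	act.sa_flags = SA_RESTART | SA_ONSTACK;
	sigaction(signo, &act, NULL);
}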
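And the second point is just a one-line change at handler-installation
time, again sketched generically rather than as the exact pqsignal.c
code:

#include <signal.h>
#include <string.h>

/*
 * Install "func" with ALL other signals blocked while it runs, instead
 * of only the triggering signal (which is what an empty sa_mask gives).
 */
static void
install_blocking_everything(int signo, void (*func) (int))
{
	struct sigaction act;

	memset(&act, 0, sizeof(act));
	act.sa_handler = func;
	sigfillset(&act.sa_mask);
	act.sa_flags = SA_RESTART;
	sigaction(signo, &act, NULL);
}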
(I will hopefully soon be able to share a branch that would get rid of
almost all signals, and optionally use pipes or futexes or other
tricks instead, depending on build options; I'm working on that...)
> > I've so far resisted the urge to spin up a Debian GNU/Hurd box to
> > figure any of that out for myself, but maybe someone has a clue...
>
> That's pretty wise — the most frustrating thing with Hurd VM, which I
> created as described above, is that it hangs during tests (only 1 out of
> 5 `make check` runs completes) and killing the hanging processes doesn't
> restore its working state — I have to reboot it (and fsck finds FS errors
> on each reboot) or even restore a copy of VM's disk.
Huh, so we're doing something unusual enough to de-stabilise some
fundamental service... Has any of this reached the Hurd mailing
lists?
Some more wild and uninformed guesses, while thinking about how to
narrow a bug report down: if the file system is inconsistent and needs
fsck to fix even though it had plenty of time to finish writing disk
blocks before you rebooted it, perhaps that means that ext2fs
(which I understand to be a user space process that manages the file
system[1]) has locked up? Of course it could easily be something
else, who knows, but that makes me wonder about the more exotic file
system operations we use. Looking at fruitcrow's configure output, I
see that it doesn't have fadvise or sync_file_range, but it does have
pwritev/preadv and posix_fallocate. They probably don't get much
exercise in other software... so maybe try telling PostgreSQL that we
don't have 'em and see what happens? It might also be related to our
vigorous renaming, truncating, fsyncing activities... It looks like
the only other plausible file system might be an NFS mount... does it
work any better?
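Before rebuilding PostgreSQL with those features disabled, a tiny
standalone exerciser for posix_fallocate() and pwritev() might show
whether ext2fs copes with them at all. Something like this (the file
name and sizes are arbitrary; it's just a smoke test, not how
PostgreSQL actually drives these calls):

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/uio.h>
#include <unistd.h>

int
main(void)
{
	const char *path = "fallocate_test.tmp";	/* arbitrary scratch file */
	char		a[8192],
				b[8192];
	struct iovec iov[2];
	int			fd,
				rc;

	fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
	if (fd < 0)
	{
		perror("open");
		return 1;
	}

	/* Preallocate some space; posix_fallocate() returns an errno value. */
	rc = posix_fallocate(fd, 0, 16 * 1024 * 1024);
	if (rc != 0)
		fprintf(stderr, "posix_fallocate: %s\n", strerror(rc));

	/* Scatter-gather write at a nonzero offset. */
	memset(a, 'a', sizeof(a));
	memset(b, 'b', sizeof(b));
	iov[0].iov_base = a;
	iov[0].iov_len = sizeof(a);
	iov[1].iov_base = b;
	iov[1].iov_len = sizeof(b);
	if (pwritev(fd, iov, 2, 8192) < 0)
		perror("pwritev");

	if (fsync(fd) < 0)
		perror("fsync");

	close(fd);
	unlink(path);
	return 0;
}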
Thinking of other maybe-slightly-unusual things in the signal
processing area that have been problematic in a couple of other OSes
(i.e. systems that added emulations of Linux system calls), I wondered
about epoll and signalfd, but it doesn't have those either, so it must
be using plain old poll() with the widely used self-pipe trick for
latches, and that doesn't seem likely to be new or buggy code.
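For anyone unfamiliar with that trick, its general shape is tiny; the
following is just an illustrative sketch, not our latch implementation:

#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int	selfpipe[2];

/* Async-signal-safe: just poke a byte into the pipe to wake the loop. */
static void
handler(int signo)
{
	int			save_errno = errno;

	(void) signo;
	(void) write(selfpipe[1], "x", 1);
	errno = save_errno;
}

int
main(void)
{
	struct sigaction sa;

	pipe(selfpipe);
	fcntl(selfpipe[0], F_SETFL, O_NONBLOCK);
	fcntl(selfpipe[1], F_SETFL, O_NONBLOCK);

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = handler;
	sigemptyset(&sa.sa_mask);
	sa.sa_flags = SA_RESTART;
	sigaction(SIGUSR1, &sa, NULL);

	for (;;)
	{
		struct pollfd pfd = {selfpipe[0], POLLIN, 0};

		if (poll(&pfd, 1, -1) > 0 && (pfd.revents & POLLIN))
		{
			char		buf[16];

			/* Drain the pipe; this is the "latch is set" moment. */
			while (read(selfpipe[0], buf, sizeof(buf)) > 0)
				;
			printf("woken by signal\n");
		}
	}
}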