| From: | Christoph Berg <myon(at)debian(dot)org> |
|---|---|
| To: | Tomas Vondra <tomas(at)vondra(dot)me> |
| Cc: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: failed NUMA pages inquiry status: Operation not permitted |
| Date: | 2025-12-11 12:29:14 |
| Message-ID: | aTq5Gt_n-oS_QSpL@msg.df7cb.de |
| Lists: | pgsql-committers pgsql-hackers |
Re: Tomas Vondra
> >> So I'm leaning to adjust pg_numa_init() to also check EPERM, per the
> >> attached patch. It still calls numa_available(), so that we don't
> >> silently miss future libnuma changes.
> >>
> >> Can you check this makes it work inside the docker container?
> >
> > Yes your patch works. (Sorry I meant to test earlier, but RL...)
>
> Thanks. I've pushed the fix (and backpatched to 18).
It looks like we are not done here yet :(
postgresql-18 is failing here intermittently with this diff:
12:20:24 --- /build/reproducible-path/postgresql-18-18.1/src/test/regress/expected/numa.out 2025-11-10 21:52:06.000000000 +0000
12:20:24 +++ /build/reproducible-path/postgresql-18-18.1/build/src/test/regress/results/numa.out 2025-12-11 11:20:22.618989603 +0000
12:20:24 @@ -6,8 +6,4 @@
12:20:24 -- switch to superuser
12:20:24 \c -
12:20:24 SELECT COUNT(*) >= 0 AS ok FROM pg_shmem_allocations_numa;
12:20:24 - ok
12:20:24 -----
12:20:24 - t
12:20:24 -(1 row)
12:20:24 -
12:20:24 +ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
That's REL_18_STABLE @ 580b5c, with the Debian packaging on top.
I've seen it on unstable/amd64, unstable/arm64, and Ubuntu
questing/amd64, where libnuma should take care of this itself, without
the extra patch in PG. There was another case on bullseye/amd64 which
has the old libnuma.
It's been frequent enough that it killed 4 out of the 10 builds
currently visible on
https://jengus.postgresql.org/job/postgresql-18-binaries-snapshot/.
(Though to be fair, only one distribution/arch combination was failing
for each of them.)
There is also one instance of it in
https://jengus.postgresql.org/job/postgresql-19-binaries-snapshot/
I currently have no idea what's happening.
Christoph