| From: | Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com> |
|---|---|
| To: | Tomas Vondra <tomas(at)vondra(dot)me> |
| Cc: | Christoph Berg <myon(at)debian(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: failed NUMA pages inquiry status: Operation not permitted |
| Date: | 2026-01-19 11:47:19 |
| Message-ID: | CAKZiRmxo-umL9889gD=Z2SZBG4y64qL_PrTmeimcykqDGwdNBQ@mail.gmail.com |
| Lists: | pgsql-committers pgsql-hackers |
On Fri, Jan 16, 2026 at 10:29 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>
> Hi,
>
> Here's WIP fix for the root cause, i.e. handling status -2 in the two
> views querying NUMA node for memory pages:
>
> * pg_shmem_allocations_numa
> * pg_buffercache_numa
>
> We can't prevent -2 from happening - the kernel can move arbitrary pages
> to swap, we have no control over it. So I think we need to handle -2 as
> "unknown" node, instead of failing. The patch simply returns NULL
> instead of the node, but in principle we might return some other value
> (but IMHO we should not return the raw status, the -2 makes no sense in
> our context, it's some internal kernel errno).
>
> The pg_buffercache_numa was not failing, it just returned the -2 status
> verbatim. But I modified it to return NULL, for consistency.
>
> AFAIK this will fix the regression tests too - they only check COUNT(*),
> not the actual values.
>
> I'm not sure if we need to mention this in the docs. It probably should
> mention the column can be NULL, which means "unknown node".
Right, OK, so I've reproduced this without the patch (as you stated, it is
enough to cause shared_buffers to be swapped out; in my case a simple
stress-ng -m 16 --vm-bytes SOME_HIGH_VALUE did it).
Without the patch the following query works at first:
select numa_node, sum(size) from pg_shmem_allocations_numa group by numa_node;
 numa_node |     sum
-----------+-------------
         0 | 24062603264
(1 row)
and then, pretty soon afterwards, it fails with:
ERROR: invalid NUMA node id outside of allowed range [0, 0]: -2
With the patch (which, by the way, looks good to me) it no longer fails;
instead I get:
 numa_node |     sum
-----------+-------------
           | 10821046272
         0 | 13241556992
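For readers following along, the behaviour discussed above (a status of -2 becomes an "unknown" node shown as NULL instead of an error) can be sketched roughly as below. This is only an illustration, not the code from the patch: the names classify_numa_status and NumaNodeClass are hypothetical, and I am assuming that any other out-of-range status would still be reported as an error.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical classification of one per-page status value returned by a
 * NUMA page inquiry. None of these names come from the actual patch.
 */
typedef enum
{
    NUMA_NODE_KNOWN,    /* status is a valid node id */
    NUMA_NODE_UNKNOWN,  /* page not present (e.g. swapped out): report NULL */
    NUMA_NODE_INVALID   /* anything else: assumed to remain an error */
} NumaNodeClass;

static NumaNodeClass
classify_numa_status(int status, int max_node, int *node)
{
    /* A value within [0, max_node] is a real NUMA node id. */
    if (status >= 0 && status <= max_node)
    {
        *node = status;
        return NUMA_NODE_KNOWN;
    }

    /* -ENOENT is -2 on Linux: the page is not present in memory. */
    if (status == -ENOENT)
        return NUMA_NODE_UNKNOWN;

    /* Assumption: every other value is still treated as invalid. */
    return NUMA_NODE_INVALID;
}
```

With max_node = 0, as in the single-node output above, a status of 0 maps to node 0 and a status of -2 maps to the NULL row.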
-J.