From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | Christoph Berg <myon(at)debian(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: pgsql: Introduce pg_shmem_allocations_numa view |
Date: | 2025-06-24 08:24:53 |
Message-ID: | aFpg1de9ZfS1QgUt@ip-10-97-1-34.eu-west-3.compute.internal |
Lists: | pgsql-committers pgsql-hackers |
Hi,
On Tue, Jun 24, 2025 at 03:43:19AM +0200, Tomas Vondra wrote:
> On 6/23/25 23:47, Tomas Vondra wrote:
> > ...
> >
> > Or maybe the 32-bit chroot on 64-bit host matters and confuses some
> > calculation.
> >
>
> I think it's likely something like this.
I think the same.
> I noticed that if I modify
> pg_buffercache_numa_pages() to query the addresses one by one, it works.
> And when I increase the number, it stops working somewhere between 16k
> and 17k items.
Yeah, same for me with pg_get_shmem_allocations_numa(): it works if
pg_numa_query_pages() is called on chunks of <= 16 pages, but fails when called
on more than 16 pages.
The attached test_chunk_size.c confirms it:
$ gcc-11 -m32 -o test_chunk_size test_chunk_size.c
$ ./test_chunk_size
1 pages: SUCCESS (0 errors)
2 pages: SUCCESS (0 errors)
3 pages: SUCCESS (0 errors)
4 pages: SUCCESS (0 errors)
5 pages: SUCCESS (0 errors)
6 pages: SUCCESS (0 errors)
7 pages: SUCCESS (0 errors)
8 pages: SUCCESS (0 errors)
9 pages: SUCCESS (0 errors)
10 pages: SUCCESS (0 errors)
11 pages: SUCCESS (0 errors)
12 pages: SUCCESS (0 errors)
13 pages: SUCCESS (0 errors)
14 pages: SUCCESS (0 errors)
15 pages: SUCCESS (0 errors)
16 pages: SUCCESS (0 errors)
17 pages: 1 errors
Threshold: 17 pages
There is no error if the test is built without -m32.
> It may be a coincidence, but I suspect it's related to the sizeof(void
> *) being 8 in the kernel, but only 4 in the chroot. So the userspace
> passes an array of 4-byte items, but kernel interprets that as 8-byte
> items. That is, we call
>
> long move_pages(int pid, unsigned long count, void *pages[.count], const
> int nodes[.count], int status[.count], int flags);
>
> Which (I assume) just passes the parameters to kernel. And it'll
> interpret them per kernel pointer size.
>
I also suspect something in this area...
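To make the suspected mismatch concrete, here is a small sketch. It only
illustrates the hypothesis above and is not a confirmed diagnosis. It reads an
array of 4-byte entries as if it held 8-byte entries; the helper names
read_as_u64() and u64_entries_available() are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Read entry `idx` of `buf` as if the array held 8-byte entries, the way
 * a reader assuming 64-bit pointers would. Hypothetical helper.
 */
static uint64_t
read_as_u64(const void *buf, size_t idx)
{
    uint64_t v;

    assert(buf != NULL);
    memcpy(&v, (const unsigned char *) buf + idx * sizeof(uint64_t),
           sizeof(v));
    return v;
}

/*
 * How many 8-byte entries fit in an array of `count` 4-byte entries
 * without reading past its end. Hypothetical helper.
 */
static size_t
u64_entries_available(size_t count)
{
    return (count * sizeof(uint32_t)) / sizeof(uint64_t);
}
```

With four 4-byte addresses, only two 8-byte entries can be read, and each one
merges two adjacent addresses into a value that matches neither of them.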
> If this is what's happening, I'm not sure what to do about it ...
We could work in chunks (of 16?) on 32-bit builds, but that would probably
degrade performance (we do mention it in the docs, though). Also, would 16
always be a correct chunk size?
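A minimal sketch of the chunking idea, under stated assumptions: the callback
type page_query_fn and the function query_pages_chunked() are hypothetical
names, and fake_query() is only a stand-in that mimics the failure seen above
16 pages. It is not the real pg_numa_query_pages().

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback: query the status of `count` pages. */
typedef int (*page_query_fn) (size_t count, void **pages, int *status);

/*
 * Stand-in for the real query (assumption for illustration): fails for
 * more than 16 pages, otherwise reports every page as being on node 0.
 */
static int
fake_query(size_t count, void **pages, int *status)
{
    (void) pages;
    if (count > 16)
        return -1;
    for (size_t i = 0; i < count; i++)
        status[i] = 0;
    return 0;
}

/*
 * Query `count` pages in pieces of at most `chunk_size` pages each.
 * Returns 0 on success, or the first non-zero result from `query`.
 */
static int
query_pages_chunked(size_t count, void **pages, int *status,
                    size_t chunk_size, page_query_fn query)
{
    size_t done = 0;

    assert(chunk_size > 0 && query != NULL);

    while (done < count)
    {
        size_t n = count - done;
        int rc;

        if (n > chunk_size)
            n = chunk_size;

        rc = query(n, pages + done, status + done);
        if (rc != 0)
            return rc;          /* stop at the first failing chunk */

        done += n;
    }
    return 0;
}
```

With the stand-in, querying 40 pages in one call fails, while the same 40
pages queried in chunks of 16 succeed. The cost is one call per chunk instead
of one call overall, which is where the performance concern comes from.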
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
Attachment | Content-Type | Size |
---|---|---|
test_chunk_size.c | text/x-csrc | 1.6 KB |