From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | Christoph Berg <myon(at)debian(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: pgsql: Introduce pg_shmem_allocations_numa view |
Date: | 2025-06-24 13:25:07 |
Message-ID: | aFqnM/iiL+MB62dG@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
Hi,
On Tue, Jun 24, 2025 at 02:33:59PM +0200, Tomas Vondra wrote:
>
>
> On 6/24/25 13:10, Bertrand Drouvot wrote:
> > So, if we look at do_pages_stat() ([1]), we can see that it uses an hardcoded
> > "#define DO_PAGES_STAT_CHUNK_NR 16UL" and that this pointers arithmetic:
> >
> > "
> > pages += chunk_nr;
> > status += chunk_nr;
> > "
> >
> > is done but has no effect since nr_pages will exit the loop if we use a batch
> > size <= 16.
> >
> > So if this pointer arithmetic is not correct, (it seems that it should advance
> > by 16 * sizeof(compat_uptr_t) instead) then it has no effect as long as the batch
> > size is <= 16.
> >
> > Does test_chunk_size also fails at 17 for you?
>
> Yes, it fails for me at 17 too. So you're saying the access within each
> chunk of 16 elements is OK, but that maybe advancing to the next chunk
> is not quite right?
Yes, I think compat_uptr_t usage is missing in do_pages_stat() (while it's used
in do_pages_move()).
Having a chunk size <= DO_PAGES_STAT_CHUNK_NR ensures we are not affected
by the wrong pointer arithmetic.
> In which case limiting the access to 16 entries
> might be a workaround.
Yes, something like:
diff --git a/src/backend/storage/ipc/shmem.c b/src/backend/storage/ipc/shmem.c
index c9ae3b45b76..070ad2f13e7 100644
--- a/src/backend/storage/ipc/shmem.c
+++ b/src/backend/storage/ipc/shmem.c
@@ -689,8 +689,17 @@ pg_get_shmem_allocations_numa(PG_FUNCTION_ARGS)
CHECK_FOR_INTERRUPTS();
}
- if (pg_numa_query_pages(0, shm_ent_page_count, page_ptrs, pages_status) == -1)
- elog(ERROR, "failed NUMA pages inquiry status: %m");
+ #define NUMA_QUERY_CHUNK_SIZE 16 /* has to be <= DO_PAGES_STAT_CHUNK_NR (do_pages_stat())*/
+
+ for (uint64 chunk_start = 0; chunk_start < shm_ent_page_count; chunk_start += NUMA_QUERY_CHUNK_SIZE) {
+ uint64 chunk_size = Min(NUMA_QUERY_CHUNK_SIZE, shm_ent_page_count - chunk_start);
+
+ if (pg_numa_query_pages(0, chunk_size, &page_ptrs[chunk_start],
+ &pages_status[chunk_start]) == -1)
+ elog(ERROR, "failed NUMA pages inquiry status: %m");
+ }
+
+ #undef NUMA_QUERY_CHUNK_SIZE
> In any case, this sounds like a kernel bug, right?
yes it sounds like a kernel bug.
> I don't have much
> experience with the kernel code, so don't want to rely too much on my
> interpretation of it.
I don't have that much experience too but I think the issue is in do_pages_stat()
and that "pages += chunk_nr" should be advanced by sizeof(compat_uptr_t) instead.
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Melanie Plageman | 2025-06-24 13:29:44 | pgsql: Test that vacuum removes tuples older than OldestXmin |
Previous Message | Tomas Vondra | 2025-06-24 12:42:43 | Re: pgsql: Introduce pg_shmem_allocations_numa view |
From | Date | Subject | |
---|---|---|---|
Next Message | jian he | 2025-06-24 13:29:51 | Re: SQL:2023 JSON simplified accessor support |
Previous Message | Melanie Plageman | 2025-06-24 13:17:14 | Re: Simplify VM counters in vacuum code |