From: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com> |
---|---|
To: | Tomas Vondra <tomas(at)vondra(dot)me> |
Cc: | Christoph Berg <myon(at)debian(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org |
Subject: | Re: pgsql: Introduce pg_shmem_allocations_numa view |
Date: | 2025-06-24 11:10:24 |
Message-ID: | aFqHoNXQ/uWKXJ4U@ip-10-97-1-34.eu-west-3.compute.internal |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-committers pgsql-hackers |
Hi,
On Tue, Jun 24, 2025 at 11:20:15AM +0200, Tomas Vondra wrote:
> On 6/24/25 10:24, Bertrand Drouvot wrote:
> > Yeah, same for me with pg_get_shmem_allocations_numa(). It works if
> > pg_numa_query_pages() is done on chunks <= 16 pages but fails if done on more
> > than 16 pages.
> >
> > It's also confirmed by test_chunk_size.c attached:
> >
> > $ gcc-11 -m32 -o test_chunk_size test_chunk_size.c
> > $ ./test_chunk_size
> > 1 pages: SUCCESS (0 errors)
> > 2 pages: SUCCESS (0 errors)
> > 3 pages: SUCCESS (0 errors)
> > 4 pages: SUCCESS (0 errors)
> > 5 pages: SUCCESS (0 errors)
> > 6 pages: SUCCESS (0 errors)
> > 7 pages: SUCCESS (0 errors)
> > 8 pages: SUCCESS (0 errors)
> > 9 pages: SUCCESS (0 errors)
> > 10 pages: SUCCESS (0 errors)
> > 11 pages: SUCCESS (0 errors)
> > 12 pages: SUCCESS (0 errors)
> > 13 pages: SUCCESS (0 errors)
> > 14 pages: SUCCESS (0 errors)
> > 15 pages: SUCCESS (0 errors)
> > 16 pages: SUCCESS (0 errors)
> > 17 pages: 1 errors
> > Threshold: 17 pages
> >
> > No error if -m32 is not used.
> >
> > We could work by chunks (16?) on 32 bits but would probably produce performance
> > degradation (we mention it in the doc though). Also would always 16 be a correct
> > chunk size?
>
> I don't see how this would solve anything?
>
> AFAICS the problem is the two places are confused about how large the
> array elements are, and get to interpret that differently.
> I don't see how using smaller array makes this correct. That it works is
> more a matter of luck,
Not sure it's luck, maybe the wrong pointers arithmetic has no effect if batch
size is <= 16.
So we have kernel_move_pages() -> kernel_move_pages() (because nodes is NULL here
for us as we call "numa_move_pages(pid, count, pages, NULL, status, 0);").
So, if we look at do_pages_stat() ([1]), we can see that it uses an hardcoded
"#define DO_PAGES_STAT_CHUNK_NR 16UL" and that this pointers arithmetic:
"
pages += chunk_nr;
status += chunk_nr;
"
is done but has no effect since nr_pages will exit the loop if we use a batch
size <= 16.
So if this pointer arithmetic is not correct, (it seems that it should advance
by 16 * sizeof(compat_uptr_t) instead) then it has no effect as long as the batch
size is <= 16.
Does test_chunk_size also fails at 17 for you?
[1]: https://github.com/torvalds/linux/blob/master/mm/migrate.c
Regards,
--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2025-06-24 11:10:50 | Re: pgsql: Introduce pg_shmem_allocations_numa view |
Previous Message | Daniel Gustafsson | 2025-06-24 10:01:16 | pgsql: doc: Remove dead link to NewbieDoc Docbook Guide |
From | Date | Subject | |
---|---|---|---|
Next Message | Andres Freund | 2025-06-24 11:10:50 | Re: pgsql: Introduce pg_shmem_allocations_numa view |
Previous Message | Ajin Cherian | 2025-06-24 10:48:18 | Re: Improve pg_sync_replication_slots() to wait for primary to advance |