Re: pgsql: Introduce pg_shmem_allocations_numa view

From: Christoph Berg <myon(at)debian(dot)org>
To: Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(at)vondra(dot)me>, Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Introduce pg_shmem_allocations_numa view
Date: 2025-06-24 14:41:33
Message-ID: aFq5HQj016rVS2lm@msg.df7cb.de
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-committers pgsql-hackers

Re: Bertrand Drouvot
> Yes, I think compat_uptr_t usage is missing in do_pages_stat() (while it's used
> in do_pages_move()).

I was also reading the kernel source around that place but you spotted
the problem before me. This patch resolves the issue here:

diff --git a/mm/migrate.c b/mm/migrate.c
index 8cf0f9c9599..542c81ec3ed 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -2444,7 +2444,13 @@ static int do_pages_stat(struct mm_struct *mm, unsigned long nr_pages,
if (copy_to_user(status, chunk_status, chunk_nr * sizeof(*status)))
break;

- pages += chunk_nr;
+ if (in_compat_syscall()) {
+ compat_uptr_t __user *pages32 = (compat_uptr_t __user *)pages;
+
+ pages32 += chunk_nr;
+ pages = (const void __user * __user *) pages32;
+ } else
+ pages += chunk_nr;
status += chunk_nr;
nr_pages -= chunk_nr;
}

> Having a chunk size <= DO_PAGES_STAT_CHUNK_NR ensures we are not affected
> by the wrong pointer arithmetic.

Good idea. Buggy kernels will be around for some time.

> + #define NUMA_QUERY_CHUNK_SIZE 16 /* has to be <= DO_PAGES_STAT_CHUNK_NR (do_pages_stat())*/
> +
> + for (uint64 chunk_start = 0; chunk_start < shm_ent_page_count; chunk_start += NUMA_QUERY_CHUNK_SIZE) {

Perhaps optimize it to this:

#if sizeof(void *) == 4
#define NUMA_QUERY_CHUNK_SIZE 16 /* has to be <= DO_PAGES_STAT_CHUNK_NR (do_pages_stat())*/
#else
#define NUMA_QUERY_CHUNK_SIZE 1024
#endif

... or some other bigger number.

The loop could also include CHECK_FOR_INTERRUPTS();

> > I don't have much
> > experience with the kernel code, so don't want to rely too much on my
> > interpretation of it.
>
> I don't have that much experience too but I think the issue is in do_pages_stat()
> and that "pages += chunk_nr" should be advanced by sizeof(compat_uptr_t) instead.

Me neither, but I'll try submit this fix.

Christoph

In response to

Responses

Browse pgsql-committers by date

  From Date Subject
Next Message Tomas Vondra 2025-06-24 15:04:00 Re: pgsql: Introduce pg_shmem_allocations_numa view
Previous Message Masahiko Sawada 2025-06-24 14:08:45 pgsql: Fix cache-dependent test failures in logical decoding.

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2025-06-24 15:04:00 Re: pgsql: Introduce pg_shmem_allocations_numa view
Previous Message Tom Lane 2025-06-24 14:13:00 Re: Safeguards against incorrect fd flags for fsync()