Re: Adding basic NUMA awareness

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding basic NUMA awareness
Date: 2025-07-25 10:51:41
Message-ID: 5a2410f8-62a9-4483-bf0a-3a8331fb0808@vondra.me
Lists: pgsql-hackers

On 7/25/25 12:27, Jakub Wartak wrote:
> On Thu, Jul 17, 2025 at 11:15 PM Tomas Vondra <tomas(at)vondra(dot)me> wrote:
>>
>> On 7/4/25 20:12, Tomas Vondra wrote:
>>> On 7/4/25 13:05, Jakub Wartak wrote:
>>>> ...
>>>>
>>>> 8. v1-0005 2x + /* if (numa_procs_interleave) */
>>>>
>>>> Ha! It's a TRAP! I've uncommented it because I wanted to try it out
>>>> without it (just by setting the GUC off), but "MyProc->sema" is NULL:
>>>>
>>>> 2025-07-04 12:31:08.103 CEST [28754] LOG: starting PostgreSQL
>>>> 19devel on x86_64-linux, compiled by gcc-12.2.0, 64-bit
>>>> [..]
>>>> 2025-07-04 12:31:08.109 CEST [28754] LOG: io worker (PID 28755)
>>>> was terminated by signal 11: Segmentation fault
>>>> 2025-07-04 12:31:08.109 CEST [28754] LOG: terminating any other
>>>> active server processes
>>>> 2025-07-04 12:31:08.114 CEST [28754] LOG: shutting down because
>>>> "restart_after_crash" is off
>>>> 2025-07-04 12:31:08.116 CEST [28754] LOG: database system is shut down
>>>>
>>>> [New LWP 28755]
>>>> [Thread debugging using libthread_db enabled]
>>>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>>>> Core was generated by `postgres: io worker '.
>>>> Program terminated with signal SIGSEGV, Segmentation fault.
>>>> #0 __new_sem_wait_fast (definitive_result=1, sem=sem@entry=0x0)
>>>> at ./nptl/sem_waitcommon.c:136
>>>> 136 ./nptl/sem_waitcommon.c: No such file or directory.
>>>> (gdb) where
>>>> #0 __new_sem_wait_fast (definitive_result=1, sem=sem@entry=0x0)
>>>> at ./nptl/sem_waitcommon.c:136
>>>> #1 __new_sem_trywait (sem=sem@entry=0x0) at ./nptl/sem_wait.c:81
>>>> #2 0x00005561918e0cac in PGSemaphoreReset (sema=0x0) at
>>>> ../src/backend/port/posix_sema.c:302
>>>> #3 0x0000556191970553 in InitAuxiliaryProcess () at
>>>> ../src/backend/storage/lmgr/proc.c:992
>>>> #4 0x00005561918e51a2 in AuxiliaryProcessMainCommon () at
>>>> ../src/backend/postmaster/auxprocess.c:65
>>>> #5 0x0000556191940676 in IoWorkerMain (startup_data=<optimized
>>>> out>, startup_data_len=<optimized out>) at
>>>> ../src/backend/storage/aio/method_worker.c:393
>>>> #6 0x00005561918e8163 in postmaster_child_launch
>>>> (child_type=child_type@entry=B_IO_WORKER, child_slot=20086,
>>>> startup_data=startup_data@entry=0x0,
>>>> startup_data_len=startup_data_len@entry=0,
>>>> client_sock=client_sock@entry=0x0) at
>>>> ../src/backend/postmaster/launch_backend.c:290
>>>> #7 0x00005561918ea09a in StartChildProcess
>>>> (type=type@entry=B_IO_WORKER) at
>>>> ../src/backend/postmaster/postmaster.c:3973
>>>> #8 0x00005561918ea308 in maybe_adjust_io_workers () at
>>>> ../src/backend/postmaster/postmaster.c:4404
>>>> [..]
>>>> (gdb) print *MyProc->sem
>>>> Cannot access memory at address 0x0
>>>>
>>>
>>> Yeah, good catch. I'll look into that next week.
>>>
>>
>> I've been unable to reproduce this issue, but I'm not sure what settings
>> you actually used for this instance. Can you give me more details how to
>> reproduce this?
>
> Better late than never. Well, feel free to partially ignore me: I
> missed that it is a known issue as per the FIXME there, but I would just
> rip out that commented-out `if (numa_procs_interleave)` from
> FastPathLockShmemSize() and PGProcShmemSize(), unless you want to save
> those memory pages of course (in the no-NUMA case). If you do want to
> save those pages, I think we have a problem:
>
> For the complete picture, the steps:
>
> 1. patch -p1 < v2-0001-NUMA-interleaving-buffers.patch
> 2. patch -p1 < v2-0006-NUMA-interleave-PGPROC-entries.patch
>
> BTW the accidental pgbench indent is still there (part of the v2-0001 patch):
> 14 out of 14 hunks FAILED -- saving rejects to file
> src/bin/pgbench/pgbench.c.rej
>
> 3. As I'm applying just 0001 and 0006, I got two simple rejects
> (because I skipped the numa_ freelist patches), but I fixed them.
> That's intentional on my part, because I wanted to play just with
> those two.
>
> 4. Then I uncomment the two "if (numa_procs_interleave)" conditions that
> make the extra shared memory sizing optional - add_size() and so on (the
> ones with the XXX comment above saying they cause bootstrap issues)
>

Ah, I didn't realize you uncommented these "if" conditions. In that case
the crash is not very surprising, because the actual initialization in
InitProcGlobal ignores the GUCs and just assumes it's enabled. Without
the extra padding reserved, that likely messes something up, or something
allocated later "overwrites" some of the memory.

I need to clean this up, so that the initialization actually respects
the GUC.
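
To illustrate the kind of mismatch described above, here is a minimal
sketch. It is not code from the patch: proc_array_size(), shmem_size(),
shmem_init(), numa_interleave and SKETCH_PAGE_SIZE are all hypothetical
names, and the padding rule is only an assumption. The point is that the
sizing step and the initialization step derive the layout from one helper
driven by the same flag, so they cannot disagree about the padding.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical page size; real code would ask the OS. */
#define SKETCH_PAGE_SIZE ((size_t) 4096)

/* Hypothetical stand-in for the numa_procs_interleave GUC. */
static bool numa_interleave = false;

/* Round size up to the next multiple of align (a power of two). */
static size_t
align_up(size_t size, size_t align)
{
    return (size + align - 1) & ~(align - 1);
}

/*
 * Single source of truth for the layout: bytes needed for nprocs entries.
 * With interleaving, assume the array is rounded up to whole pages and one
 * extra page is reserved so that it can start on a page boundary.
 */
static size_t
proc_array_size(size_t nprocs, size_t entry_size, bool interleave)
{
    size_t size = nprocs * entry_size;

    if (interleave)
        size = align_up(size, SKETCH_PAGE_SIZE) + SKETCH_PAGE_SIZE;

    return size;
}

/* Sizing step: how much shared memory to reserve. */
static size_t
shmem_size(size_t nprocs, size_t entry_size)
{
    return proc_array_size(nprocs, entry_size, numa_interleave);
}

/*
 * Initialization step: consults the same flag as the sizing step and
 * checks that it stays within the reservation. Returns the bytes used.
 */
static size_t
shmem_init(size_t nprocs, size_t entry_size, size_t reserved)
{
    size_t used = proc_array_size(nprocs, entry_size, numa_interleave);

    assert(used <= reserved);
    return used;
}
```

If shmem_init() instead assumed interleaving unconditionally while
shmem_size() honored the flag, the assertion would fail with the flag off,
which is the sketch's analogue of the situation behind the crash.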

FWIW I do have a new patch version that I plan to share in a day or two,
once I get some numbers. It didn't change this particular part, though;
it's more about the buffers/freelists/clocksweep. I'll work on PGPROC
next, I think.

regards

--
Tomas Vondra
