Re: Adding basic NUMA awareness

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Jakub Wartak <jakub(dot)wartak(at)enterprisedb(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Adding basic NUMA awareness
Date: 2025-07-17 21:14:52
Message-ID: b877e417-32c8-4864-b89e-cf6c66c1196d@vondra.me
Lists: pgsql-hackers

On 7/4/25 20:12, Tomas Vondra wrote:
> On 7/4/25 13:05, Jakub Wartak wrote:
>> ...
>>
>> 8. v1-0005 2x + /* if (numa_procs_interleave) */
>>
>> Ha! It's a TRAP! I uncommented it because I wanted to try running
>> without it (just by setting the GUC off), but "MyProc->sema" is NULL:
>>
>> 2025-07-04 12:31:08.103 CEST [28754] LOG: starting PostgreSQL
>> 19devel on x86_64-linux, compiled by gcc-12.2.0, 64-bit
>> [..]
>> 2025-07-04 12:31:08.109 CEST [28754] LOG: io worker (PID 28755)
>> was terminated by signal 11: Segmentation fault
>> 2025-07-04 12:31:08.109 CEST [28754] LOG: terminating any other
>> active server processes
>> 2025-07-04 12:31:08.114 CEST [28754] LOG: shutting down because
>> "restart_after_crash" is off
>> 2025-07-04 12:31:08.116 CEST [28754] LOG: database system is shut down
>>
>> [New LWP 28755]
>> [Thread debugging using libthread_db enabled]
>> Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
>> Core was generated by `postgres: io worker '.
>> Program terminated with signal SIGSEGV, Segmentation fault.
>> #0 __new_sem_wait_fast (definitive_result=1, sem=sem@entry=0x0)
>> at ./nptl/sem_waitcommon.c:136
>> 136 ./nptl/sem_waitcommon.c: No such file or directory.
>> (gdb) where
>> #0 __new_sem_wait_fast (definitive_result=1, sem=sem@entry=0x0)
>> at ./nptl/sem_waitcommon.c:136
>> #1 __new_sem_trywait (sem=sem@entry=0x0) at ./nptl/sem_wait.c:81
>> #2 0x00005561918e0cac in PGSemaphoreReset (sema=0x0) at
>> ../src/backend/port/posix_sema.c:302
>> #3 0x0000556191970553 in InitAuxiliaryProcess () at
>> ../src/backend/storage/lmgr/proc.c:992
>> #4 0x00005561918e51a2 in AuxiliaryProcessMainCommon () at
>> ../src/backend/postmaster/auxprocess.c:65
>> #5 0x0000556191940676 in IoWorkerMain (startup_data=<optimized
>> out>, startup_data_len=<optimized out>) at
>> ../src/backend/storage/aio/method_worker.c:393
>> #6 0x00005561918e8163 in postmaster_child_launch
>> (child_type=child_type@entry=B_IO_WORKER, child_slot=20086,
>> startup_data=startup_data@entry=0x0,
>> startup_data_len=startup_data_len@entry=0,
>> client_sock=client_sock@entry=0x0) at
>> ../src/backend/postmaster/launch_backend.c:290
>> #7 0x00005561918ea09a in StartChildProcess
>> (type=type@entry=B_IO_WORKER) at
>> ../src/backend/postmaster/postmaster.c:3973
>> #8 0x00005561918ea308 in maybe_adjust_io_workers () at
>> ../src/backend/postmaster/postmaster.c:4404
>> [..]
>> (gdb) print *MyProc->sem
>> Cannot access memory at address 0x0
>>
>
> Yeah, good catch. I'll look into that next week.
>

I've been unable to reproduce this issue, but I'm not sure what settings
you actually used for this instance. Can you give me more details on how
to reproduce this?

regards

--
Tomas Vondra
