From: | Steven Niu <niushiji(at)gmail(dot)com> |
---|---|
To: | Mikhail Kot <mikhail(dot)kot(at)databricks(dot)com>, "pgsql-hackers(at)lists(dot)postgresql(dot)org" <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | "to(at)myrrc(dot)dev" <to(at)myrrc(dot)dev> |
Subject: | Re: Fix segfault while accessing half-initialized hash table in pgstat_shmem.c |
Date: | 2025-09-03 07:22:00 |
Message-ID: | MN2PR15MB302160E8AC4AA87DE2B95183A701A@MN2PR15MB3021.namprd15.prod.outlook.com |
Lists: | pgsql-hackers |
I found many instances of the following pattern:
ptr_1 = dsa_allocate();
ptr_2 = dsa_get_address(xxx, ptr_1);
ptr_2->yyy = zzz;
Inside dsa_get_address(dsa_area *area, dsa_pointer dp):
/* Convert InvalidDsaPointer to NULL. */
if (!DsaPointerIsValid(dp))
return NULL;
So unless dsa_allocate() is guaranteed never to return InvalidDsaPointer, there is a risk of a segfault.
In fact, dsa_allocate() does return InvalidDsaPointer in some cases.
So maybe we should add a pointer check everywhere dsa_get_address() is called. Comments?
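To make the suggested check concrete, here is a minimal, self-contained sketch. It does not use PostgreSQL headers: mock_allocate(), mock_get_address(), MockEntry and the fixed-size pool are hypothetical stand-ins that only imitate the behaviour quoted above (an invalid pointer is converted to NULL), so treat it as an illustration of the pattern rather than actual backend code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the DSA API; this is not PostgreSQL code. */
typedef uint64_t mock_dsa_pointer;
#define MOCK_INVALID_DSA_POINTER ((mock_dsa_pointer) 0)
#define MOCK_POOL_SIZE 4

typedef struct MockEntry
{
    int yyy;
} MockEntry;

/* Backing storage: slot i is addressed by "pointer" i + 1. */
static MockEntry mock_pool[MOCK_POOL_SIZE];
static size_t mock_used = 0;

/* Returns the invalid pointer once the pool is exhausted. */
static mock_dsa_pointer
mock_allocate(void)
{
    if (mock_used >= MOCK_POOL_SIZE)
        return MOCK_INVALID_DSA_POINTER;
    return (mock_dsa_pointer) ++mock_used;
}

/* Imitates the quoted behaviour: an invalid pointer becomes NULL. */
static MockEntry *
mock_get_address(mock_dsa_pointer dp)
{
    if (dp == MOCK_INVALID_DSA_POINTER)
        return NULL;
    return &mock_pool[dp - 1];
}

/* Checked variant of the pattern: 0 on success, -1 if allocation failed. */
static int
mock_store(int zzz)
{
    mock_dsa_pointer ptr_1 = mock_allocate();
    MockEntry *ptr_2 = mock_get_address(ptr_1);

    if (ptr_2 == NULL)
        return -1;              /* report the failure instead of dereferencing NULL */
    ptr_2->yyy = zzz;
    return 0;
}
```

Whether each real call site needs such a check depends on how the allocation was requested. As far as I know, dsa_allocate_extended() only returns InvalidDsaPointer on out-of-memory when DSA_ALLOC_NO_OOM is passed and raises an error otherwise, so that is worth confirming per call site.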
________________________________
From: Mikhail Kot <mikhail(dot)kot(at)databricks(dot)com>
Sent: Wednesday, September 3, 2025 04:09
To: pgsql-hackers(at)lists(dot)postgresql(dot)org <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Cc: to(at)myrrc(dot)dev <to(at)myrrc(dot)dev>
Subject: Fix segfault while accessing half-initialized hash table in pgstat_shmem.c
Hi,
I've encountered the following segmentation fault lately. It happens when
Postgres is experiencing high memory pressure. There are multiple OOM errors in
the log as well.
Core was generated by `postgres: neondb_owner neondb ::1(46658) BIND'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 pg_atomic_read_u32_impl (ptr=0x8) at ../../../../src/include/port/atomics/generic.h:48
#1 pg_atomic_read_u32 (ptr=0x8) at ../../../../src/include/port/atomics.h:239
#2 LWLockAttemptLock (lock=lock@entry=0x4, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:821
#3 0x000056446bce129f in LWLockConditionalAcquire (lock=0x4, mode=mode@entry=LW_EXCLUSIVE) at lwlock.c:1386
#4 0x000056446bd0bacf in pgstat_lock_entry (entry_ref=entry_ref@entry=0x56446d9f4340, nowait=nowait@entry=true) at pgstat_shmem.c:625
#5 0x000056446bd0a3c9 in pgstat_relation_flush_cb (entry_ref=0x56446d9f4340, nowait=<optimized out>) at pgstat_relation.c:794
#6 0x000056446bd069f5 in pgstat_flush_pending_entries (nowait=<optimized out>) at pgstat.c:1217
#7 pgstat_report_stat (force=<optimized out>, force@entry=false) at pgstat.c:658
#8 0x000056446bcf16c1 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4623
#9 0x000056446bc716b3 in BackendRun (port=<optimized out>, port=<optimized out>) at postmaster.c:4465
#10 BackendStartup (port=<optimized out>) at postmaster.c:4193
#11 ServerLoop () at postmaster.c:1782
#12 0x000056446bc726ea in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x56446cd803b0) at postmaster.c:1466
#13 0x000056446b9d5a00 in main (argc=3, argv=0x56446cd803b0) at main.c:238
The error originates in pgstat_shmem.c, where shhashent is left in a
half-initialized state if pgstat_init_entry(), which calls dsa_allocate0(),
errors out with OOM. A later access to shhashent then causes the segmentation
fault. I propose a patch which solves this issue. The patch is for the main
branch, but the code is nearly identical in Postgres 13-17, so I suggest
backporting it to the other supported versions.
The patch changes pgstat_init_entry()'s behaviour, returning NULL if memory
allocation failed. It also adds sanity checks to routines accepting arguments
returned by pgstat_init_entry().
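As a rough illustration of the idea (not the patch itself), the sketch below models a hash entry whose body is allocated separately. Every name in it (MockHashEntry, mock_init_entry(), mock_get_or_create(), mock_lock_entry(), the mock_fail_alloc switch) is hypothetical, and rolling back the hash slot on failure is my own assumption about one way to avoid leaving a half-initialized entry behind.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical model of a hash entry with a separately allocated body. */
typedef struct MockBody
{
    int lock;
} MockBody;

typedef struct MockHashEntry
{
    bool      in_use;
    int       key;
    MockBody *body;
} MockHashEntry;

#define MOCK_TABLE_SIZE 8
static MockHashEntry mock_table[MOCK_TABLE_SIZE];
static bool mock_fail_alloc = false;    /* test switch simulating OOM */

/* Returns NULL on allocation failure instead of erroring out. */
static MockBody *
mock_init_entry(MockHashEntry *ent)
{
    MockBody *body = mock_fail_alloc ? NULL : calloc(1, sizeof(MockBody));

    if (body == NULL)
        return NULL;
    ent->body = body;
    return body;
}

/* Finds or creates an entry; never leaves a slot without a body. */
static MockHashEntry *
mock_get_or_create(int key)
{
    MockHashEntry *free_slot = NULL;

    for (size_t i = 0; i < MOCK_TABLE_SIZE; i++)
    {
        if (mock_table[i].in_use && mock_table[i].key == key)
            return &mock_table[i];
        if (!mock_table[i].in_use && free_slot == NULL)
            free_slot = &mock_table[i];
    }
    if (free_slot == NULL)
        return NULL;            /* table full */

    free_slot->in_use = true;
    free_slot->key = key;
    free_slot->body = NULL;
    if (mock_init_entry(free_slot) == NULL)
    {
        free_slot->in_use = false;  /* roll back the half-initialized slot */
        return NULL;
    }
    return free_slot;
}

/* Sanity check on the caller side: refuse entries without a body. */
static bool
mock_lock_entry(MockHashEntry *ent)
{
    if (ent == NULL || ent->body == NULL)
        return false;
    ent->body->lock = 1;
    return true;
}
```

With mock_fail_alloc set, mock_get_or_create() returns NULL and the slot stays free, so a later lookup cannot find an entry whose body pointer was never set.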
Reproducing this behaviour is tricky, because under OOM Postgres doesn't
necessarily reach the point where this specific dsa_allocate0() call fails.