| From: | Alexander Lakhin <exclusion(at)gmail(dot)com> |
|---|---|
| To: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
| Cc: | pgsql-bugs(at)lists(dot)postgresql(dot)org, Heikki Linnakangas <hlinnaka(at)iki(dot)fi> |
| Subject: | Re: BUG #18374: Printing memory contexts on OOM condition might lead to segmentation fault |
| Date: | 2026-05-16 19:00:00 |
| Message-ID: | 89884b8a-2a50-4f6d-9e9c-223c6741643e@gmail.com |
| Lists: | pgsql-bugs |
Hello Tom,
03.03.2024 23:39, Tom Lane wrote:
> I wrote:
>> I find this in [1]:
>>
>> The C language stack growth does an implicit mremap. If you want absolute
>> guarantees and run close to the edge you MUST mmap your stack for the
>> largest size you think you will need. For typical stack usage this does
>> not matter much but it's a corner case if you really really care
>>
>> Seems like we need to do some more work at startup to enforce that
>> we have the amount of stack we think we do, if we're on Linux.
> After thinking about that some more, I'm really quite unenthused about
> trying to remap the stack for ourselves. It'd be both platform- and
> architecture-dependent, and I'm afraid it'd introduce as many failure
> modes as it removes. (Notably, I'm not sure we could guarantee
> there's a guard page below the stack.) Since we've not seen reports
> of this failure from the wild, I doubt it's worth the trouble.
I'm not too excited either, but I observed such SIGSEGVs in a
memory-restricted cloud (neon) environment (perhaps it could be considered
the wild), and what looks bad to me is that there is no protection against
it at all. That is, if you can get an out-of-memory error in some
environment, you can also bring the whole server down with a segfault,
accidentally or intentionally.
I researched the subject and found only one way to prevent this: allocating
the stack memory (up to max_stack_depth) when a postmaster child starts.
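To illustrate the idea only (this is not the code from the attached patch), here is a minimal sketch of preallocating stack memory by touching every page of a stack region. The function name prefault_stack() is hypothetical, and the sketch assumes a downward-growing stack, a compiler that supports C99 variable-length arrays, and a kernel that keeps the touched pages mapped after the function returns. If the memory cannot be provided, this simple version crashes itself.

```c
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper (not taken from the patch): force the kernel to back
 * `bytes` of stack below the current frame with real memory by writing one
 * byte to every page. Returns 0 on success, -1 on invalid input.
 */
int
prefault_stack(size_t bytes)
{
    long        page = sysconf(_SC_PAGESIZE);

    if (page <= 0 || bytes == 0)
        return -1;

    /* C99 variable-length array: moves the stack pointer down by `bytes`. */
    volatile char region[bytes];
    size_t      off = bytes;

    /* Walk from the highest address downwards, the way the stack grows. */
    while (off > 0)
    {
        region[off - 1] = 0;
        off = (off > (size_t) page) ? off - (size_t) page : 0;
    }
    region[0] = 0;              /* make sure the lowest page is touched too */

    return 0;
}
```

A child process would call something like prefault_stack(max_stack_depth_bytes) once at startup, so that later deep recursion runs on pages that are already mapped.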
Please look at the attached test, which triggers the server crash, and a
possible protection against it.
When running this test on Linux, I'm getting:
PROVE_TESTS="t/099*" make -s check -C src/test/modules/test_misc/
# Testing ulimit -Sv 280000
# out of memory
# Testing ulimit -Sv 1140000
# stack depth limit exceeded
...
# Boundary between 'out of memory' and 'stack depth limit exceeded' found: 283779
...
# Testing ulimit -Sv 275587
# psql:<stdin>:17: server closed the connection unexpectedly
# This probably means the server terminated abnormally
# before or while processing the request.
2026-05-16 20:33:30.724 EEST postmaster[4101481] LOG: client backend (PID 4101496) was terminated by signal 11: Segmentation fault
2026-05-16 20:33:30.724 EEST postmaster[4101481] DETAIL: Failed process was running: select explainer('execute stmt');
While with preallocate_stack=on, the server survives the test:
echo "preallocate_stack=on" >/tmp/temp.config
TEMP_CONFIG=/tmp/temp.config PROVE_TESTS="t/099*" make -s check -C src/test/modules/test_misc/
ulimit -Sv is easy to use for the test, but in the wild the restriction
would more likely apply to the total amount of memory for all (postgres)
processes, so there could be more interesting scenarios...
Catching SIGSEGV in allocate_stack() is needed to correctly handle the
even worse situation, where there is not enough memory to preallocate the
stack even at process start.
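As a rough sketch of that technique (again, not the patch's allocate_stack() itself), the page-touching loop can be guarded by a SIGSEGV handler that runs on an alternate signal stack and jumps back with siglongjmp(), so a failed preallocation becomes an error return instead of a crash. All names below (prefault_stack_guarded, touch_pages, on_segv, ALT_STACK_SIZE) are hypothetical, and a downward-growing stack and C99 variable-length arrays are assumed.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700       /* for sigaltstack() and sigsetjmp() */
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

#define ALT_STACK_SIZE (64 * 1024)  /* assumed size for the handler's stack */

static sigjmp_buf prefault_escape;

/* Runs on the alternate stack; abandons the failed page-touching loop. */
static void
on_segv(int signo)
{
    (void) signo;
    siglongjmp(prefault_escape, 1);
}

/* Write one byte to every page of a `bytes`-sized stack region. */
static void
touch_pages(size_t bytes, size_t page)
{
    volatile char region[bytes];
    size_t      off = bytes;

    while (off > 0)
    {
        region[off - 1] = 0;
        off = (off > page) ? off - page : 0;
    }
    region[0] = 0;
}

/* Returns 0 if the stack could be preallocated, -1 otherwise. */
int
prefault_stack_guarded(size_t bytes)
{
    long        page = sysconf(_SC_PAGESIZE);
    stack_t     alt, old_alt;
    struct sigaction sa, old_sa;
    volatile int result = -1;

    if (page <= 0 || bytes == 0)
        return -1;

    /* The handler cannot run on the stack that has just overflowed. */
    alt.ss_sp = malloc(ALT_STACK_SIZE);
    if (alt.ss_sp == NULL)
        return -1;
    alt.ss_size = ALT_STACK_SIZE;
    alt.ss_flags = 0;
    if (sigaltstack(&alt, &old_alt) != 0)
    {
        free(alt.ss_sp);
        return -1;
    }

    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;
    if (sigaction(SIGSEGV, &sa, &old_sa) == 0)
    {
        /* savemask = 1, so the signal mask is restored by siglongjmp() */
        if (sigsetjmp(prefault_escape, 1) == 0)
        {
            touch_pages(bytes, (size_t) page);
            result = 0;
        }
        sigaction(SIGSEGV, &old_sa, NULL);
    }

    /* Restore the previous alternate-stack setting and clean up. */
    sigaltstack(&old_alt, NULL);
    free(alt.ss_sp);
    return result;
}
```

A caller could then report a regular error when prefault_stack_guarded() returns -1, rather than letting the process die with signal 11.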
The test and the fix work for me on Linux and FreeBSD.
Yes, this protection has its price (max_stack_depth * number of processes),
but perhaps those who want to avoid server crashes should have the choice.
Thanks to Heikki for help with making the solution robust.
Best regards,
Alexander
| Attachment | Content-Type | Size |
|---|---|---|
| 099_stack_overflow.pl | application/x-perl | 2.3 KB |
| prevent-segfaults-under-memory-pressure.patch | text/x-patch | 6.3 KB |