From: | David Geier <geidav(dot)pg(at)gmail(dot)com> |
---|---|
To: | pgsql-hackers(at)lists(dot)postgresql(dot)org |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Subject: | Re: Assertion failure with barriers in parallel hash join |
Date: | 2022-06-02 09:30:31 |
Message-ID: | 165416223138.23516.14510796482570365828.pgcf@coridan.postgresql.org |
Lists: | pgsql-hackers |
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
Hi all,
We recently encountered the same bug in the field. Oleksii Kozlov managed to come up with reproduction steps that reliably trigger it. Interestingly, the bug manifests not only as a failing assertion but also as a segmentation fault, and the segmentation fault occurs in builds with assertions disabled and with assertions enabled (!). So it can crash production environments. We applied the proposed patch v3 from Melanie to the REL_14_3 branch and can confirm that, with the patch applied, neither the assertion failure nor the segmentation fault occurs anymore.
I have also glanced at the code, and the implementation looks fine. However, I'm not an expert on the fairly involved hash join state machine.
There seems to be no need for additional documentation.
For completeness, here is the stack trace of the segmentation fault.
Like the stack trace of the assertion failure, which Michael initially shared and which we also encountered, the stack trace of the segmentation fault contains ExecParallelHashJoinNewBatch().
#9 | Source "/opt/src/backend/executor/execMain.c", line 361, in standard_ExecutorRun
| Source "/opt/src/backend/executor/execMain.c", line 1551, in ExecutePlan
Source "/opt/src/include/executor/executor.h", line 257, in ExecProcNode [0x657e4d]
#8 | Source "/opt/src/backend/executor/nodeAgg.c", line 2179, in ExecAgg
Source "/opt/src/backend/executor/nodeAgg.c", line 2364, in agg_retrieve_direct [0x66ba60]
#7 | Source "/opt/src/backend/executor/nodeAgg.c", line 581, in fetch_input_tuple
Source "/opt/src/include/executor/executor.h", line 257, in ExecProcNode [0x66d585]
#6 | Source "/opt/src/backend/executor/nodeHashjoin.c", line 607, in ExecParallelHashJoin
| Source "/opt/src/backend/executor/nodeHashjoin.c", line 560, in ExecHashJoinImpl
Source "/opt/src/backend/executor/nodeHashjoin.c", line 1132, in ExecParallelHashJoinNewBatch [0x67a89b]
#5 | Source "/opt/src/backend/storage/ipc/barrier.c", line 242, in BarrierAttach
Source "/opt/src/include/storage/s_lock.h", line 228, in tas [0x7c2a1b]
#4 Object "/lib/x86_64-linux-gnu/libpthread.so.0", at 0x7f4db364841f, in __funlockfile
--
David Geier
(ServiceNow)
The new status of this patch is: Ready for Committer