From: | Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> |
---|---|
To: | Jerry Sievers <gsievers19(at)comcast(dot)net> |
Cc: | pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: SegFault on 9.6.14 |
Date: | 2019-07-16 00:34:49 |
Message-ID: | 20190716003449.fjegtxinhrqubysu@development |
Lists: | pgsql-hackers |
On Mon, Jul 15, 2019 at 07:22:55PM -0500, Jerry Sievers wrote:
>Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> writes:
>
>> On Mon, Jul 15, 2019 at 06:48:05PM -0500, Jerry Sievers wrote:
>>
>>>Greetings Hackers.
>>>
>>>We have a reproducible case of $subject that issues a backtrace such as
>>>seen below.
>>>
>>>The query that I'd prefer to sanitize before sending is <30 lines of at
>>>a glance, not terribly complex logic.
>>>
>>>It nonetheless dies hard after a few seconds of running and as expected,
>>>results in an automatic all-backend restart.
>>>
>>>Please advise on how to proceed. Thanks!
>>>
>>>bt
>>>#0 initscan (scan=scan(at)entry=0x55d7a7daa0b0, key=0x0, keep_startblock=keep_startblock(at)entry=1 '\001')
>>> at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:233
>>>#1 0x000055d7a72fa8d0 in heap_rescan (scan=0x55d7a7daa0b0, key=key(at)entry=0x0) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/access/heap/heapam.c:1529
>>>#2 0x000055d7a7451fef in ExecReScanSeqScan (node=node(at)entry=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeSeqscan.c:280
>>>#3 0x000055d7a742d36e in ExecReScan (node=0x55d7a7d85100) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:158
>>>#4 0x000055d7a7445d38 in ExecReScanGather (node=node(at)entry=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeGather.c:475
>>>#5 0x000055d7a742d255 in ExecReScan (node=0x55d7a7d84d30) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:166
>>>#6 0x000055d7a7448673 in ExecReScanHashJoin (node=node(at)entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/nodeHashjoin.c:1019
>>>#7 0x000055d7a742d29e in ExecReScan (node=node(at)entry=0x55d7a7d84110) at /build/postgresql-9.6-5O8OLM/postgresql-9.6-9.6.14/build/../src/backend/executor/execAmi.c:226
>>><about 30 lines omitted>
>>>
>>
>> Hmmm, that means it's crashing here:
>>
>>     if (scan->rs_parallel != NULL)
>>         scan->rs_nblocks = scan->rs_parallel->phs_nblocks;  <--- here
>>     else
>>         scan->rs_nblocks = RelationGetNumberOfBlocks(scan->rs_rd);
>>
>> But clearly, scan is valid (otherwise it'd crash on the if condition),
>> and scan->rs_parallel must be non-NULL. Which probably means the pointer
>> is (no longer) valid.
>>
>> Could it be that the rs_parallel DSM disappears on rescan, or something
>> like that?
>
>No clue but something I just tried was to disable parallelism by setting
>max_parallel_workers_per_gather to 0, and although the query has not
>finished after a few minutes, there is no crash.
>
That might be a hint my rough analysis was somewhat correct. The
question is whether the non-parallel plan does the same thing. Maybe it
picks a plan that does not require rescans, or something like that.
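
To illustrate the failure mode suspected above, here is a minimal sketch in
C. It is not PostgreSQL code: DemoSegment, DemoScan and the demo_* functions
are hypothetical stand-ins, and the "detached" segment is modelled with a
flag so the example stays well-defined instead of touching unmapped memory.
It only shows that a non-NULL check says nothing about whether the memory
behind the pointer is still there.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a shared-memory segment; not a PostgreSQL type. */
typedef struct DemoSegment
{
    bool     mapped;    /* false once the segment has been "detached" */
    unsigned nblocks;   /* plays the role of phs_nblocks */
} DemoSegment;

/* Hypothetical stand-in for the scan descriptor. */
typedef struct DemoScan
{
    DemoSegment *parallel;  /* plays the role of rs_parallel */
    unsigned     nblocks;   /* plays the role of rs_nblocks */
} DemoScan;

/* Simulate the segment going away. The scan's pointer is NOT reset. */
static void
demo_detach(DemoSegment *seg)
{
    assert(seg != NULL);
    seg->mapped = false;
}

/* The same test as in the snippet above: true only means the pointer was set. */
static bool
demo_null_check_passes(const DemoScan *scan)
{
    return scan->parallel != NULL;
}

/*
 * Rescan in this model. Returns false at the point where real code would
 * dereference memory that is no longer mapped.
 */
static bool
demo_rescan(DemoScan *scan, unsigned fallback_nblocks)
{
    if (scan->parallel != NULL)
    {
        if (!scan->parallel->mapped)
            return false;       /* stale pointer, NULL check already passed */
        scan->nblocks = scan->parallel->nblocks;
    }
    else
        scan->nblocks = fallback_nblocks;
    return true;
}
```

In this model, a rescan after demo_detach() still passes the NULL check and
only fails on the extra "mapped" test, which the real code has no way of
doing. A scan with parallel == NULL takes the else branch and never touches
the segment, which would be consistent with the crash going away once
parallelism is disabled.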

>Please advise.
>
It would be useful to see (a) the execution plan of the query, (b) a full
backtrace and (c) a bit of context for the place where it crashed.
Something like (in gdb):

    bt full
    list
    p *scan

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services