Re: BUG #15395: Assert failure when using CURRENT OF with inheritance

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: mat(at)timescale(dot)com
Cc: pgsql-bugs(at)lists(dot)postgresql(dot)org
Subject: Re: BUG #15395: Assert failure when using CURRENT OF with inheritance
Date: 2018-09-23 18:08:11
Message-ID: 25035.1537726091@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

=?utf-8?q?PG_Bug_reporting_form?= <noreply(at)postgresql(dot)org> writes:
> The following SQL script causes an Assert failure if PostgreSQL is compiled
> with asserts.

Nice catch! This has evidently been broken for a long time.

> When run on a PostgreSQL instance with Asserts turned off,
> this script seems to work as expected. Thus, I suspect that it may be safe
> to just remove the Assert at execCurrent.c:239 but am not familiar enough
> with this part of the code to be sure.

No, I don't think that's a great fix. After tracing through it, I can
see that what's going wrong is:

1. The plan tree for the cursor looks like

EXPLAIN DECLARE current_check_cursor SCROLL CURSOR FOR SELECT * FROM current_check;
QUERY PLAN
--------------------------------------------------------------------------
Append (cost=0.00..58.11 rows=2541 width=36)
-> Seq Scan on current_check (cost=0.00..0.00 rows=1 width=36)
-> Seq Scan on current_check_1 (cost=0.00..22.70 rows=1270 width=36)
-> Seq Scan on current_check_2 (cost=0.00..22.70 rows=1270 width=36)
(4 rows)

2. The "FETCH ABSOLUTE 2" command runs through the single tuple available
from the seqscans on current_check and current_check_1, and then stops
on the first tuple available from current_check_2.

3. The "FETCH ABSOLUTE 1" command rewinds the executor and fetches the
first available tuple, which is in current_check_1.

4. At this point, the SeqScan node for current_check has been run to
completion, so its ScanTupleSlot is empty. The node for current_check_1
is correctly positioned on a tuple. The node for current_check_2 has
not been visited yet, so its state is as ExecReScan left it. The trouble
is that the only thing that's happened to its ScanTupleSlot is where
heapam.c's initscan() does

ItemPointerSetInvalid(&scan->rs_ctup.t_self);

It's only rather accidental that the plan node's ScanTupleSlot is
pointing at that at all, but it is. So execCurrent.c sees an apparently
valid scan tuple containing an invalid t_self, and it spits up, which
I think is entirely a good thing.

I think that the correct fix is to ensure that the ScanTupleSlot gets
cleared during ExecReScan, which fortunately is pretty easy to mechanize
because all scan node types call ExecScanReScan. So the execScan.c
part of the patch below seems to be enough to resolve the reported case.

However, we're not really out of the woods yet, because I noticed that
places like ExecReScanAppend will postpone calling ExecReScan if there's
a chgParam flag set on their sub-nodes. This means that we can't
necessarily rely on ExecutorRewind performing an immediate call of
ExecReScan for every scan node in the plan. It seems to me, therefore,
that we need to teach execCurrent.c to check for pending-rescan flags
and not believe that a node is really on a tuple if it finds such a
flag at or above the node. The execCurrent.c part of the attached patch
does that.

I've not tried yet to create an actual test case for the chgParam-based
failure. It's definitely possible that that problem is only hypothetical
at the moment because cursor plans that would satisfy search_plan_tree
would be too simple to contain any such flags. But I think we'd better
add that logic anyway.

regards, tom lane

Attachment Content-Type Size
fix-bug-15395.patch text/x-diff 5.2 KB

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Sergei Kornilov 2018-09-23 18:27:10 Re: BUG #15396: initdb emits wrong comment for range for effective_io_concurrency
Previous Message PG Bug reporting form 2018-09-23 16:06:04 BUG #15396: initdb emits wrong comment for range for effective_io_concurrency