Re: Fix missing EvalPlanQual recheck for TID scans

From: "Sophie Alpert" <pg(at)sophiebits(dot)com>
To: "Chao Li" <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: "David Rowley" <dgrowleyml(at)gmail(dot)com>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Fix missing EvalPlanQual recheck for TID scans
Date: 2025-09-13 22:12:04
Message-ID: b163dce8-b79f-4ed2-a20d-ece2d1a13ab6@app.fastmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 9, 2025 at 10:18 PM, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
> No, that’s not true. If you extend David’s procedure and use 3 sessions to reproduce the problem, s1 update (0,1), s2 update (0,2) and s3 update (0,1) or (0,2), and trace the backend process of s3, you will see every time when TidRecheck() is called, “node” is brand new, so it has to call TidListEval(node) every time.

After more investigation I agree you are correct here; in a single query, ReScan is called once for each subsequent tuple being rechecked. (ExecUpdate calls EvalPlanQualBegin which always sets rcplanstate->chgParam.)

On Wed, Sep 10, 2025 at 10:30 PM, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
> But, why can't we make TidRecheck() to simplify return FALSE?
>
> Looks like the only case where TidRecheck() is called is a concurrent transaction upadated the row, and in that case, ctid have must be changed.

Although returning FALSE is clever, it is only correct if several other unstated preconditions for the function hold, which I am loath to rely on.

And indeed, like I mentioned in my previous message, my isolation test `permutation tid1 tidsucceed2 c1 c2 read` from eval-plan-qual.spec in my patch will fail if Recheck were to return false in this case. Though somewhat contrived, you can imagine this happening with multiple sessions driven by the same application:

setup: two rows exist with ctid=(0,1) and (0,2)
S1: BEGIN;
S2: BEGIN;
S1: UPDATE WHERE ctid=(0,1) RETURNING ctid;
-- returns (0,3), which the application uses in the next query from another session:
S2: UPDATE WHERE ctid=(0,1) OR ctid=(0,3); -- statement snapshot sees (0,1); recheck will see (0,3) and should pass
S1: COMMIT;
S2: COMMIT;

I would not defend this as good code from an application developer but the behavior is observable. So I understand it would be best to match the enable_tidscan = off behavior, which my existing strategy more verifiably does. Of course if the team disagrees with me then I will defer to everyone's better judgement.

I hope it is rare that many tuples need to be rechecked in a single query but if this is common, then I would suggest it would be better to rework the generic EPQ machinery so that EPQ nodes do not need to be not reset for every row, rather than depending on subtle implicit guarantees in the TidRecheck code.

Sophie

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2025-09-14 01:03:06 Re: Adding skip scan (including MDAM style range skip scan) to nbtree
Previous Message Alexander Korotkov 2025-09-13 19:31:32 Re: Implement waiting for wal lsn replay: reloaded