Re: BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks

From: Andres Freund <andres(at)anarazel(dot)de>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Alex Aktsipetrov <alex(dot)akts(at)gmail(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #15900: `executor could not find named tuplestore` in triggers with transition table and row locks
Date: 2019-07-11 00:52:02
Message-ID: 20190711005202.nq4hdomsdi7ml4fr@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi,

On 2019-07-09 11:38:13 -0400, Tom Lane wrote:
> Thomas Munro <thomas(dot)munro(at)gmail(dot)com> writes:
> >> Alex's repro doesn't work on 11 though,
> >> because EPQ is not entered at all. Which raises the question: why do
> >> we need to enter EPQ after commit ad0bda5d on 12/master, for a row
> >> that hasn't been updated by anyone else?
>
> > Explanation: since ad0bda5d24ea, ExecLockRows() always calls
> > EvalPlanQualBegin() which initialises the plan state, and in this case
> > ExecInitNamedTuplestoreScan() errors out due to the bug. Before, you
> > needed the right concurrency scenario (epq_needed) before we did that,
> > as the reporter of bug #15720 discovered.
>
> I'm quite desperately unhappy about this observation, because
> EvalPlanQualBegin is a *large* amount of overhead that is usually
> unnecessary, and is now going to be paid for *every locked row*
> whether there's any conflict on it or not. I do not find that
> acceptable. Why is it necessary to do this before finding that
> there's an update conflict?

Two main reasons:

Previously we referenced tuples from LockRowsState->lr_curtuples, and
the management of that was pretty tightly interlinked with EPQ, and only
worked for heap tuples.Keeping that scheme would have been somewhat
complicated to continue to maintain.

Secondly, previously the tuple fetched by heap_lock_tuple() was just
stored in a local variable - but now that happens via a slot (as we
otherwise cannot reasonably handle things like heap wanting to return a
pinned buffer, and others not). As we potentially need to lock rows
from multiple tables, we'd need multiple slots suitable to lock/fetch those
rows.

EPQ already needed similar infrastructure internally - and those tuples
were fetched and retained for EPQ's benefit. So it seemed sensible to
just use the slots from EPQ. Note that we don't need
EvalPlanQualFetch() anymore, as it's work is done inside the AM -
obviously there's no AM independent way to perform correct ctid chasing.

Which large overhead you mean is going to be paid for "*every locked
row*"? EvalPlanQualBegin() ought to be fairly fast for repeated calls in
the same node - although I do admit that it'd be a lot nicer if the
ExecSetParamPlanMulti() work wouldn't need be redone.

I think it might be reasonable to split EvalPlanQualBegin() (and perhaps
EvalPlanQualStart()) into two pieces: One to just have a minimal
EPQState, with valid slots, and one to get ready to actually run EPQ?

Unfortunately I'm going to be unreachable pretty soon till Monday
(hiking without any network access), so I won't be immediately able to respond.

Greetings,

Andres Freund

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Peter Geoghegan 2019-07-11 07:18:46 Re: BUG #15896: pg_upgrade from 10-or-earlier: TRAP: FailedAssertion(»!(metad->btm_version >= 3)«
Previous Message Michael Paquier 2019-07-10 23:58:30 Re: BUG #15888: Bogus "idle in transaction" state for logical decoding client after creating a slot