Re: generic plans and "initial" pruning

From: Thom Brown <thom(at)linux(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tender Wang <tndrwang(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: generic plans and "initial" pruning
Date: 2026-05-27 12:03:24
Message-ID: CAA-aLv5+dSSQ7KKZPgnysnFOTEXkFKYbeqSWk5Qu61_3Vd8aJw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 4 Apr 2026 at 13:11, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> Attached is a redesigned version. While working on the previous
> design, I grew increasingly uncomfortable with CachedPlanPrepData --
> it was smuggling executor state out of GetCachedPlan() through an
> out-parameter, which papered over the real problem: GetCachedPlan()
> was doing too much. The main change in this version is architectural:
> GetCachedPlan() no longer acquires execution locks. Callers now own
> that responsibility, which is natural because each call site iterates
> stmt_list differently and manages execution state in its own way --
> and it lets them choose between conservative lock-all and
> pruning-aware locking where appropriate.
>
> Non-portal call sites remain on the conservative path for now.
> _SPI_execute_plan requires care around snapshot setup, which happens
> after plan fetch rather than before. SQL functions have a different
> issue: init_execution_state() fetches the plan while postquel_start()
> handles execution, with execution_state containers in between, making
> it harder to thread a prepped QueryDesc through. The portal path and
> EXPLAIN EXECUTE cover the most common
> prepared-statement-with-partitions workloads; the remaining sites can
> be converted incrementally.
>
> This is now starting to feel closer to what Tom suggested back in
> January 2023 [1], where he proposed getting rid of
> AcquireExecutorLocks() inside GetCachedPlan() entirely and pushing
> lock acquisition out to callers. He noted that "we'd be pushing the
> responsibility for looping back and re-planning out to fairly
> high-level calling code" and that "we'd definitely be changing some
> fundamental APIs." That is the direction I came around to over the
> last couple of weeks while wrestling with CachedPlanPrepData. The
> reverted approach also tried to follow Tom's direction but moved
> locking into ExecutorStart(), which forced it to handle plan
> invalidation from inside the executor by mutating the CachedPlan
> in-place. This version moves locking out to the callers instead, so
> the executor and plan cache never reach into each other.
>
> The series is now four patches:
>
> 0001: Move execution lock acquisition out of GetCachedPlan(). Adds
> AcquireExecutorLocks() as a caller-facing function with validity check
> and retry. Adds PortalLockCachedPlan() in pquery.c to centralize the
> portal retry logic. All callers are converted. No behavioral change.
>
> 0002: Refactor executor's initial partition pruning setup. Cleanup
> only, no behavioral change.
>
> 0003: Introduce ExecutorPrep() and refactor executor startup. Factors
> range table init, permission checks, and initial pruning out of
> InitPlan(). Scaffolding for 0004; all callers still go through the
> normal ExecutorStart() path.
>
> 0004: Use pruning-aware locking for single-statement cached plans.
> Adds ExecutorPrepAndLock() which locks unprunable relations, runs
> ExecutorPrep() to determine surviving partitions, then locks only
> those. Extends PortalLockCachedPlan() with a pruning-aware path for
> eligible plans. Multi-statement CachedPlans (from rule rewriting)
> always use conservative locking. In principle, this could be relaxed
> if the planner can prove that no pruning expression reads state
> modified by an earlier statement, but that is left for a future patch.
> Includes regression tests.
>
> In case it's not clear, I'm not targeting v19 at this point. I'd like
> to get this into v20 CF1 and would welcome review from anyone
> interested.

After not having looked at this in close to 2 years, I thought I'd
give it another look. Not found any user-facing issues, and I'm liking
seeing so few locks in pg_locks. I can see that with pruning disabled,
the fallback works, pruning-aware locking is working via SPI through
plpgsql, running ALTER between executions and also invalidating
indexes force replans, and it's looking good.

But I also think there might be a bug in patch 0001, but I'd
appreciate checking my reasoning because I'm not fully confident I've
been diligent enough.

When PortalStart() opens a SELECT cursor that's backed by a cached
plan, it does roughly the following. It builds a queryDesc (an
executor-side struct), one of whose fields is a pointer into the plan
tree inside the portal's cached plan. Then it calls
PortalLockCachedPlan() to acquire the necessary locks, and finally
hands the queryDesc over to the executor.

My worry is about what happens if the cached plan turns out to be
stale, for instance because someone ran DDL on a referenced table. In
that case PortalLockCachedPlan() throws the old plan away (via
ReleaseCachedPlan) and fetches a freshly-built replacement, updtating
the portal's own pointers to match. But the queryDesc from earlier
isn't touched. Its plan pointer still references the old, now-released
plan. From what I can see, once that old plan's last reference is
dropped its memory can be freed, which would leave the executor
reading from freed memory in the next step.

The bit I'm least sure about is whether the old plan's memory really
does get reclaimed straight away when its refcount hits zero. If
something keeps it alive longer then this isn't a bug, or at least not
as bad as I'm making out. I had a look but couldn't convince myself
either way from the code alone. To actually hit this you'd need a
cursor on a cached plan, plus an invalidation arriving in the small
window between the portal being set up and the cursor being opened.
The race condition is brief, and I've not been able to hit it in
testing.

The thing that got me thinking this is real: patch 0004 modifies
PortalLockCachedPlan() so that whenever it replans, it also rebuilds
the queryDesc. That's pretty much the fix I'd expect for this, which
makes me suspect somebody hit it at some point. But 0004 only applies
that fix on the new pruning-aware code path, and it was mentioned in
the thread that 0001 to 0003 might land before 0004. If so, master
would carry the bug in the gap between the two.

I suspect a way to deal with it would be to move the CreateQueryDesc
call in the SELECT case to after PortalLockCachedPlan() returns, which
is what the other portal strategies already seem to do. Alternatively,
you could bring 0004's changes in this area into 0001 and have
PortalLockCachedPlan() always rebuild the queryDesc when it replans.

If I've got this wrong and there's some lifetime mechanism I missed
that keeps the old plan's memory alive, then it's a non-issue and I'm
misreading the code. If I have got it wrong, could you please add
comments to make what is going on clearer?

Regards

Thom

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2026-05-27 12:10:01 Re: [PATCH] Release replication slot on error in SQL-callable slot functions
Previous Message Zhijie Hou (Fujitsu) 2026-05-27 11:50:16 Fix race in ReplicationSlotRelease for ephemeral slots