Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tender Wang <tndrwang(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>
Subject: Re: generic plans and "initial" pruning
Date: 2026-03-07 09:54:27
Message-ID: CA+HiwqGFNe7kBkKZm0KtG_CFfw-ciK659SJMGP0CWVaa2q8rmw@mail.gmail.com
Lists: pgsql-hackers

Hi,

Attached is v6 of the patch series. I've been working toward
committing this, so I wanted to lay out the ExecutorPrep() design and
the key trade-offs before doing so.

When a cached generic plan references a partitioned table,
GetCachedPlan() locks all partitions upfront via
AcquireExecutorLocks(), even those that initial pruning will
eliminate. But initial partition pruning only runs later during
ExecutorStart(). Moving pruning earlier requires some executor setup
(range table, permissions, pruning state), and ExecutorPrep() is the
vehicle for that. Unlike the approach that was reverted last May, this
one keeps the CachedPlan itself unchanged -- all per-execution state
flows through a separate CachedPlanPrepData that the caller provides.

The approach also keeps GetCachedPlan()'s interface
backward-compatible: the new CachedPlanPrepData argument is optional.
If a caller passes NULL, all partitions are locked as before and
nothing changes, so existing callers, and any new code that calls
GetCachedPlan() without caring about pruning-aware locking, just
work.
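
To make the opt-in concrete, here is a rough sketch of the two calling
patterns. The exact GetCachedPlan() signature and the
CachedPlanPrepData fields shown are assumptions based on the
description above, not necessarily the precise v6 API:

```c
/* Hypothetical sketch -- argument order and field names are assumed. */

/* Legacy behavior: NULL prep data, all partitions locked upfront. */
cplan = GetCachedPlan(plansource, params, owner, queryEnv, NULL);

/*
 * Opt-in: caller supplies prep data; GetCachedPlan() locks only the
 * surviving partitions and fills prep_estates, which the caller must
 * later deliver to ExecutorStart().
 */
CachedPlanPrepData prep = {0};

cplan = GetCachedPlan(plansource, params, owner, queryEnv, &prep);
portal->prep_estates = prep.prep_estates;   /* hand off to the executor */
```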

The risk is on the other side: if a caller does pass a
CachedPlanPrepData, GetCachedPlan() will lock only the surviving
partitions and populate prep_estates with the EStates that
ExecutorPrep() created. The caller then must make those EStates
available to ExecutorStart() -- via QueryDesc->estate,
portal->prep_estates, or the equivalent path for SPI and SQL
functions. If it fails to do so, ExecutorStart() will call
ExecutorPrep() again, which may compute different pruning results than
the original call did, so the executor could end up operating on
relations for which no lock was ever acquired.

So the contract is: if you opt in to pruning-aware locking by passing
CachedPlanPrepData, you must complete the pipeline by delivering the
prep EStates to the executor. In the current patch, all the call sites
that pass a CachedPlanPrepData (portals, SPI, EXECUTE, SQL functions,
EXPLAIN) do thread the EStates through correctly, and I've tried to
make the plumbing straightforward enough that it's hard to get wrong.
But it is a new invariant that didn't exist before, and a caller that
gets it wrong would fail silently rather than with an obvious error.
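
In pseudo-C, the pipeline a call site must complete looks something
like the following; again, the field names are assumed from the
description above rather than taken from the patch:

```c
/* Sketch only: a caller that opted in must deliver the prep EState. */
queryDesc = CreateQueryDesc(plannedstmt, sourceText, snapshot,
                            crosscheck, dest, params, queryEnv, 0);
queryDesc->estate = linitial(prep.prep_estates);    /* the handoff */

/*
 * ExecutorStart() now reuses the pruning results computed under the
 * locks GetCachedPlan() actually took, instead of re-running
 * ExecutorPrep() against relations it may not hold locks on.
 */
ExecutorStart(queryDesc, eflags);
```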

To catch such violations, I've added a debug-only check in
standard_ExecutorStart() that runs when no prep EState was provided.
It iterates over the plan's rtable and verifies that every lockable
relation is actually locked. That should always hold if
AcquireExecutorLocks() locked everything, but would fail if
pruning-aware locking happened upstream and the caller dropped the
prep EState. The check is skipped in parallel workers, which acquire
relation locks lazily in ExecGetRangeTableRelation().

+   if (queryDesc->estate == NULL)
+   {
+#ifdef USE_ASSERT_CHECKING
+       if (!IsParallelWorker())
+       {
+           ListCell   *lc;
+
+           foreach(lc, queryDesc->plannedstmt->rtable)
+           {
+               RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
+
+               if (rte->rtekind == RTE_RELATION ||
+                   (rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
+                   Assert(CheckRelationOidLockedByMe(rte->relid,
+                                                     rte->rellockmode,
+                                                     true));
+           }
+       }
+#endif
+       queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
+                                        queryDesc->params,
+                                        CurrentResourceOwner,
+                                        true,
+                                        eflags);
+   }
+#ifdef USE_ASSERT_CHECKING
+   else
+   {
+       /*
+        * A prep EState was provided, meaning pruning-aware locking
+        * should have locked at least the unpruned relations.
+        */
+       if (!IsParallelWorker())
+       {
+           int         rtindex = -1;
+
+           while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
+                                             rtindex)) >= 0)
+           {
+               RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
+
+               Assert(rte->rtekind == RTE_RELATION ||
+                      (rte->rtekind == RTE_SUBQUERY &&
+                       rte->relid != InvalidOid));
+               Assert(CheckRelationOidLockedByMe(rte->relid,
+                                                 rte->rellockmode, true));
+           }
+       }
+   }
+#endif

So the invariant is: if no prep EState was provided, every relation in
the plan is locked; if one was provided, at least the unpruned
relations are locked. Both are checked in assert builds.

I think this covers the main concerns, but I may be missing something.
If anyone sees a problem with this approach, I'd like to hear about
it.

--
Thanks,
Amit Langote

Attachment Content-Type Size
v6-0004-Use-pruning-aware-locking-in-cached-plans.patch application/octet-stream 37.7 KB
v6-0003-Add-test-for-partition-lock-behavior-with-generic.patch application/octet-stream 5.3 KB
v6-0006-Reuse-partition-pruning-results-in-parallel-worke.patch application/octet-stream 15.9 KB
v6-0005-Make-SQL-function-executor-track-ExecutorPrep-sta.patch application/octet-stream 7.8 KB
v6-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch application/octet-stream 27.6 KB
v6-0001-Refactor-partition-pruning-initialization-for-cla.patch application/octet-stream 10.2 KB
