Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tender Wang <tndrwang(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>
Subject: Re: generic plans and "initial" pruning
Date: 2026-03-19 17:20:49
Message-ID: CA+HiwqHBxDL=3qQa1f-sBOBZqB88EiVAiagXF3X8Kagpr6Yhpw@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 9, 2026 at 1:41 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Sat, Mar 7, 2026 at 6:54 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > Attached is v6 of the patch series. I've been working toward
> > committing this, so I wanted to lay out the ExecutorPrep() design and
> > the key trade-offs before doing so.
> >
> > When a cached generic plan references a partitioned table,
> > GetCachedPlan() locks all partitions upfront via
> > AcquireExecutorLocks(), even those that initial pruning will
> > eliminate. But initial partition pruning only runs later during
> > ExecutorStart(). Moving pruning earlier requires some executor setup
> > (range table, permissions, pruning state), and ExecutorPrep() is the
> > vehicle for that. Unlike the approach reverted in last May, this
> > keeps the CachedPlan itself unchanged -- all per-execution state flows
> > through a separate CachedPlanPrepData that the caller provides.
> >
> > The approach also keeps GetCachedPlan()'s interface
> > backward-compatible: the new CachedPlanPrepData argument is optional.
> > If a caller passes NULL, all partitions are locked as before and
> > nothing changes. This means existing callers, and any new code that
> > calls GetCachedPlan() without caring about pruning-aware locking,
> > continue to work unchanged.
> >
> > The risk is on the other side: if a caller does pass a
> > CachedPlanPrepData, GetCachedPlan() will lock only the surviving
> > partitions and populate prep_estates with the EStates that
> > ExecutorPrep() created. The caller then must make those EStates
> > available to ExecutorStart() -- via QueryDesc->estate,
> > portal->prep_estates, or the equivalent path for SPI and SQL
> > functions. If it fails to do so, ExecutorStart() will call
> > ExecutorPrep() again, which may compute different pruning results from
> > the original call and thus expect locks on relations that were never
> > acquired. The executor would then operate on relations it holds no
> > locks on.
> >
> > So the contract is: if you opt in to pruning-aware locking by passing
> > CachedPlanPrepData, you must complete the pipeline by delivering the
> > prep EStates to the executor. In the current patch, all the call sites
> > that pass a CachedPlanPrepData (portals, SPI, EXECUTE, SQL functions,
> > EXPLAIN) do thread the EStates through correctly, and I've tried to
> > make the plumbing straightforward enough that it's hard to get wrong.
> > But it is a new invariant that didn't exist before, and a caller that
> > gets it wrong would fail silently rather than with an obvious error.
> >
> > To catch such violations, I've added a debug-only check in
> > standard_ExecutorStart() that fires when no prep EState was provided.
> > It iterates over the plan's rtable and verifies that every lockable
> > relation is actually locked. It should always be true if
> > AcquireExecutorLocks() locked everything, but would fail if
> > pruning-aware locking happened upstream and the caller dropped the
> > prep EState. The check is skipped in parallel workers, which acquire
> > relation locks lazily in ExecGetRangeTableRelation().
> >
> > + if (queryDesc->estate == NULL)
> > + {
> > +#ifdef USE_ASSERT_CHECKING
> > + if (!IsParallelWorker())
> > + {
> > + ListCell *lc;
> > +
> > + foreach(lc, queryDesc->plannedstmt->rtable)
> > + {
> > + RangeTblEntry *rte = lfirst_node(RangeTblEntry, lc);
> > +
> > + if (rte->rtekind == RTE_RELATION ||
> > + (rte->rtekind == RTE_SUBQUERY && rte->relid != InvalidOid))
> > + Assert(CheckRelationOidLockedByMe(rte->relid,
> > + rte->rellockmode,
> > + true));
> > + }
> > + }
> > +#endif
> > + queryDesc->estate = ExecutorPrep(queryDesc->plannedstmt,
> > + queryDesc->params,
> > + CurrentResourceOwner,
> > + true,
> > + eflags);
> > + }
> > +#ifdef USE_ASSERT_CHECKING
> > + else
> > + {
> > + /*
> > + * A prep EState was provided, meaning pruning-aware locking
> > + * should have locked at least the unpruned relations.
> > + */
> > + if (!IsParallelWorker())
> > + {
> > + int rtindex = -1;
> > +
> > + while ((rtindex = bms_next_member(queryDesc->estate->es_unpruned_relids,
> > + rtindex)) >= 0)
> > + {
> > + RangeTblEntry *rte = exec_rt_fetch(rtindex, queryDesc->estate);
> > +
> > + Assert(rte->rtekind == RTE_RELATION ||
> > + (rte->rtekind == RTE_SUBQUERY &&
> > + rte->relid != InvalidOid));
> > + Assert(CheckRelationOidLockedByMe(rte->relid,
> > + rte->rellockmode, true));
> > + }
> > + }
> > + }
> > +#endif
> >
> > So the invariant is: if no prep EState was provided, every relation in
> > the plan is locked; if one was provided, at least the unpruned
> > relations are locked. Both are checked in assert builds.
> >
> > I think this covers the main concerns, but I may be missing something.
> > If anyone sees a problem with this approach, I'd like to hear about
> > it.
>
> Here's v7. Some plancache.c changes that I'd made were in the wrong
> patch in v6; this version puts them where they belong.

Attached is an updated set. One more fix: I added an Assert in
SPI_cursor_open_internal()'s !plan->saved path to verify that
prep_estates is NIL. Unsaved plans always take the custom plan path,
so pruning-aware locking never applies, but it's worth guarding
explicitly since the copyObject/ReleaseCachedPlan sequence that
follows would not be safe otherwise. Also changed
SPI_plan_get_cached_plan() to pass NULL for cprep, since it only
returns the CachedPlan pointer and has no way to deliver prep_estates
to anyone.

Stepping back -- the core question is whether running executor logic
(pruning) inside GetCachedPlan() is acceptable at all. The plan cache
and executor have always had a clean boundary: plan cache locks
everything, executor runs. This optimization necessarily crosses that
line, because the information needed to decide which locks to skip
(pruning results) can only come from executor machinery.
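For reference, the opt-in side of the interface looks roughly like this (a sketch in C-style pseudocode; the argument order and field names are from memory and may not match the patch exactly):

```
/* Existing behavior: no cprep, AcquireExecutorLocks() locks everything */
cplan = GetCachedPlan(plansource, params, owner, queryEnv, NULL);

/*
 * Opting in: GetCachedPlan() runs ExecutorPrep(), locks only the
 * surviving partitions, and returns the prep EStates in cprep.
 */
CachedPlanPrepData cprep = {0};
cplan = GetCachedPlan(plansource, params, owner, queryEnv, &cprep);
/* caller must now deliver cprep.prep_estates to ExecutorStart() */
```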

The proposed approach has GetCachedPlan() call ExecutorPrep() to do a
limited subset of executor work (range table init, permissions,
pruning), carry the results out through CachedPlanPrepData, and leave
the CachedPlan itself untouched. The executor already has a multi-step
protocol: start/run/end. prep/start/run/end is just a finer
decomposition of what InitPlan() was already doing inside
ExecutorStart().
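To make the decomposition concrete, the intended call sequence is roughly the following (pseudocode; error handling and the exact delivery mechanism for the EState are omitted):

```
estate = ExecutorPrep(plannedstmt, params, owner, ...);
                                   /* range table, permissions,
                                    * initial pruning + locking */
queryDesc->estate = estate;        /* hand the prep EState over */
ExecutorStart(queryDesc, eflags);  /* skips re-prep since estate is set */
ExecutorRun(queryDesc, direction, count);
ExecutorEnd(queryDesc);
```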

Of the attached patches, I'm targeting 0001-0003 for commit. 0004 (SQL
function support) and 0005 (parallel worker reuse) are useful
follow-ons but not essential. The optimization works without them for
most cases, and they can be reviewed and committed separately.

If there's a cleaner way to avoid locking pruned partitions without
the plumbing this patch adds, I haven't found it in the year since the
revert. I'd welcome a pointer if you see one. Failing that, I think
this is the right trade-off, but it's a judgment call about where to
hold your nose.

Tom, I'd value your opinion on whether this approach is something
you'd be comfortable seeing in the tree.

--
Thanks, Amit Langote

Attachment Content-Type Size
v8-0005-Reuse-partition-pruning-results-in-parallel-worke.patch application/octet-stream 11.0 KB
v8-0003-Use-pruning-aware-locking-in-cached-plans.patch application/octet-stream 41.1 KB
v8-0004-Make-SQL-function-executor-track-ExecutorPrep-sta.patch application/octet-stream 7.8 KB
v8-0002-Introduce-ExecutorPrep-and-refactor-executor-star.patch application/octet-stream 27.1 KB
v8-0001-Refactor-partition-pruning-initialization-for-cla.patch application/octet-stream 10.2 KB
