| From: | Amit Langote <amitlangote09(at)gmail(dot)com> |
|---|---|
| To: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tender Wang <tndrwang(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Tomas Vondra <tomas(at)vondra(dot)me>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com> |
| Subject: | Re: generic plans and "initial" pruning |
| Date: | 2026-03-26 09:24:28 |
| Message-ID: | CA+HiwqGq=xQvE0oCeOX_oXWq2iyNs5q9UwopyQ2uXF2kJPXTDg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Wed, Mar 25, 2026 at 4:39 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Fri, Mar 20, 2026 at 2:20 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > On Mon, Mar 9, 2026 at 1:41 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > Stepping back -- the core question is whether running executor logic
> > (pruning) inside GetCachedPlan() is acceptable at all. The plan cache
> > and executor have always had a clean boundary: plan cache locks
> > everything, executor runs. This optimization necessarily crosses that
> > line, because the information needed to decide which locks to skip
> > (pruning results) can only come from executor machinery.
> >
> > The proposed approach has GetCachedPlan() call ExecutorPrep() to do a
> > limited subset of executor work (range table init, permissions,
> > pruning), carry the results out through CachedPlanPrepData, and leave
> > the CachedPlan itself untouched. The executor already has a multi-step
> > protocol: start/run/end. prep/start/run/end is just a finer
> > decomposition of what InitPlan() was already doing inside
> > ExecutorStart().
> >
> > Of the attached patches, I'm targeting 0001-0003 for commit. 0004 (SQL
> > function support) and 0005 (parallel worker reuse) are useful
> > follow-ons but not essential. The optimization works without them for
> > most cases, and they can be reviewed and committed separately.
> >
> > If there's a cleaner way to avoid locking pruned partitions without
> > the plumbing this patch adds, I haven't found it in the year since the
> > revert. I'd welcome a pointer if you see one. Failing that, I think
> > this is the right trade-off, but it's a judgment call about where to
> > hold your nose.
> >
> > Tom, I'd value your opinion on whether this approach is something
> > you'd be comfortable seeing in the tree.
>
> Attached is an updated set with some cleanup after another pass.
>
> - Removed ExecCreatePartitionPruneStates() from 0001. In 0001-0003,
> ExecDoInitialPruning() handles both setup and pruning internally; the
> split isn't needed yet.
>
> - Tightened commit messages to describe what each commit does now, not
> what later commits will use it for. In particular, 0002 is upfront
> that the portal/SPI/EXPLAIN plumbing is scaffolding that 0003 lights
> up.
>
> - Updated setrefs.c comment for firstResultRels to drop a blanket
> claim about one ModifyTable per query level.
>
> As before, 0001-0003 is the focus, maybe 0004 which teaches the new
> GetCachedPlan() pruning-aware contract to its relatively new user in
> function.c.
While reviewing the patch more carefully, I realized there's a
correctness issue when rule rewriting causes a single statement to
expand into multiple PlannedStmts in one CachedPlan.
PortalRunMulti() executes those statements sequentially, with
CommandCounterIncrement() between them, so Q2's ExecutorStart()
normally sees the effects of Q1.
With the patch, though, AcquireExecutorLocksUnpruned() runs
ExecutorPrep() on all PlannedStmts in one pass during GetCachedPlan(),
before any statement executes. If a later statement has
initial-pruning expressions that read data modified by an earlier one,
pruning can see stale results.
There's also a memory lifetime issue: PortalRunMulti() calls
MemoryContextDeleteChildren(portalContext) between statements, which
destroys EStates prepared for later statements.
Here's a concrete case demonstrating the semantic issue:
create table multistmt_pt (a int, b int) partition by list (a);
create table multistmt_pt_1 partition of multistmt_pt for values in (1);
create table multistmt_pt_2 partition of multistmt_pt for values in (2);
insert into multistmt_pt values (1, 0), (2, 0);
create table prune_config (val int);
insert into prune_config values (1);
create function get_prune_val() returns int as $$
select val from prune_config;
$$ language sql stable;
-- rule action runs first, updating prune_config before the
-- original statement's pruning would normally be evaluated
create rule config_upd_rule as on update to multistmt_pt
do also update prune_config set val = 2;
set plan_cache_mode to force_generic_plan;
prepare multi_q as
update multistmt_pt set b = b + 1 where a = get_prune_val();
execute multi_q; -- creates the generic plan
-- reset for the real test
update prune_config set val = 1;
update multistmt_pt set b = 0;
-- second execute reuses the plan
execute multi_q;
select * from multistmt_pt order by a;
Without the patch: the rule action updates prune_config to val=2
first, then after CCI the original statement's initial pruning calls
get_prune_val(), gets 2, prunes to multistmt_pt_2, and updates it
correctly: (1, 0), (2, 1).
With the patch as it stood: both statements' pruning runs during
GetCachedPlan() before either executes. The original statement's
pruning sees val=1, prunes to multistmt_pt_1, and multistmt_pt_2 is
never touched.
The fix is to skip pruning-aware locking for CachedPlans containing
multiple PlannedStmts, falling back to locking all partitions.
Single-statement plans are unchanged.
Since multi-statement plans are now excluded, CachedPlanPrepData no
longer needs a list of EStates -- it carries a single EState pointer.
This simplifies the plumbing throughout: PortalData,
PortalDefineQuery, SPI, and EXPLAIN all pass a single optional EState
instead of walking parallel lists. The next_prep_estate() helper is
gone.
Attached is the updated set.
--
Thanks, Amit Langote
| Attachment | Content-Type | Size |
|---|---|---|
| v10-0005-Reuse-partition-pruning-results-in-parallel-work.patch | application/octet-stream | 15.8 KB |
| v10-0001-Refactor-executor-s-initial-partition-pruning-se.patch | application/octet-stream | 7.3 KB |
| v10-0002-Introduce-ExecutorPrep-and-refactor-executor-sta.patch | application/octet-stream | 23.5 KB |
| v10-0003-Use-pruning-aware-locking-in-cached-plans.patch | application/octet-stream | 47.3 KB |
| v10-0004-Make-SQL-function-executor-track-ExecutorPrep-st.patch | application/octet-stream | 7.7 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | SATYANARAYANA NARLAPURAM | 2026-03-26 09:48:42 | Re: Introduce XID age based replication slot invalidation |
| Previous Message | John Naylor | 2026-03-26 09:22:07 | Re: Adjust error message for CREATE STATISTICS to account for expressions |