Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Thom Brown <thom(at)linux(dot)com>
Cc: Daniel Gustafsson <daniel(at)yesql(dot)se>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jacob Champion <jchampion(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Robert Haas <robertmhaas(at)gmail(dot)com>
Subject: Re: generic plans and "initial" pruning
Date: 2023-08-02 13:39:45
Message-ID: CA+HiwqEP3j25702EeergM7o8GqC79Dx-3gHKnvfa8oRJiXBDgA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

While chatting with Robert about this patch set, he suggested that it
would be better to break out some executor refactoring changes from
the main patch (0003) into a separate patch. To wit, the changes to
make the PlanState tree cleanup in ExecEndPlan() non-recursive by
walking a flat list of PlanState nodes instead of the recursive tree
walk that ExecEndNode() currently does. That allows us to cleanly
handle the cases where the PlanState tree is only partially
constructed when ExecInitNode() detects in the middle of its
construction that the plan tree is no longer valid after receiving and
processing an invalidation message on locking child tables. Or at
least more cleanly than the previously proposed approach of adjusting
ExecEndNode() subroutines for the individual node types to gracefully
handle such partially initialized PlanState trees.

With the new approach, node type specific subroutines of ExecEndNode()
need not close its child nodes, because ExecEndPlan() would close each
node that would have been initialized directly. I couldn't find any
instance of breakage by this decoupling of child node cleanup from
their parent node's cleanup. Comments in ExecEndGather() and
ExecEndGatherMerge() appear to suggest that outerPlan must be closed
before the local cleanup:

void
ExecEndGather(GatherState *node)
{
- ExecEndNode(outerPlanState(node)); /* let children clean up first */
+ /* outerPlan is closed separately. */
ExecShutdownGather(node);
ExecFreeExprContext(&node->ps);

But I don't think there's a problem, because what ExecShutdownGather()
does seems entirely independent of cleanup of outerPlan.

As for the performance impact of initializing the list of initialized
nodes to use during the cleanup phase, I couldn't find a regression,
nor any improvement by replacing the tree walk by linear scan of a
list. Actually, ExecEndNode() is pretty far down in the perf profile
anyway, so the performance difference caused by the patch hardly
matters. See the following contrived example:

create table f();
analyze f;
explain (costs off) select count(*) from f f1, f f2, f f3, f f4, f f5,
f f6, f f7, f f8, f f9, f f10;
QUERY PLAN
------------------------------------------------------------------------------
Aggregate
-> Nested Loop
-> Nested Loop
-> Nested Loop
-> Nested Loop
-> Nested Loop
-> Nested Loop
-> Nested Loop
-> Nested Loop
-> Nested Loop
-> Seq Scan on f f1
-> Seq Scan on f f2
-> Seq Scan on f f3
-> Seq Scan on f f4
-> Seq Scan on f f5
-> Seq Scan on f f6
-> Seq Scan on f f7
-> Seq Scan on f f8
-> Seq Scan on f f9
-> Seq Scan on f f10
(20 rows)

do $$
begin
for i in 1..100000 loop
perform count(*) from f f1, f f2, f f3, f f4, f f5, f f6, f f7, f f8,
f f9, f f10;
end loop;
end; $$;

Times for the DO:

Unpatched:
Time: 756.353 ms
Time: 745.752 ms
Time: 749.184 ms

Patched:
Time: 737.717 ms
Time: 747.815 ms
Time: 753.456 ms

I've attached the new refactoring patch as 0001.

Another change I've made in the main patch is to change the API of
ExecutorStart() (and ExecutorStart_hook) more explicitly to return a
boolean indicating whether or not the plan initialization was
successful. That way seems better than making the callers figure that
out by seeing that QueryDesc.planstate is NULL and/or checking
QueryDesc.plan_valid. Correspondingly, PortalStart() now also returns
true or false matching what ExecutorStart() returned. I suppose this
better alerts any extensions that use the ExecutorStart_hook to fix
their code to do the right thing.

Having extracted the ExecEndNode() change, I'm also starting to feel
inclined to extract a couple of other bits from the main patch as
separate patches, such as moving the ExecutorStart() call from
PortalRun() to PortalStart() for the multi-query portals. I'll do
that in the next version.

Attachment Content-Type Size
v43-0002-Add-field-to-store-parent-relids-to-Append-Merge.patch application/octet-stream 21.2 KB
v43-0001-Make-PlanState-tree-cleanup-non-recursive.patch application/octet-stream 28.2 KB
v43-0005-Track-opened-range-table-relations-in-a-List-in-.patch application/octet-stream 2.4 KB
v43-0003-Set-inFromCl-to-false-in-child-table-RTEs.patch application/octet-stream 3.7 KB
v43-0004-Delay-locking-of-child-tables-in-cached-plans-un.patch application/octet-stream 122.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2023-08-02 13:48:28 Re: Use of additional index columns in rows filtering
Previous Message Amit Kapila 2023-08-02 13:19:08 Re: [PATCH] Reuse Workers and Replication Slots during Logical Replication