Re: generic plans and "initial" pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Thom Brown <thom(at)linux(dot)com>, Daniel Gustafsson <daniel(at)yesql(dot)se>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jacob Champion <jchampion(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: generic plans and "initial" pruning
Date: 2023-08-07 15:36:32
Message-ID: CA+Tgmoan++3tDKjBysnp3QdJc_f+zKYFezXQw9jUPvRY=kZ9Cg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 3, 2023 at 4:37 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Here's a patch set where the refactoring to move the ExecutorStart()
> calls to be closer to GetCachedPlan() (for the call sites that use a
> CachedPlan) is extracted into a separate patch, 0002. Its commit
> message notes an aspect of this refactoring that I feel a bit nervous
> about -- needing to also move the CommandCounterIncrement() call from
> the loop in PortalRunMulti() to PortalStart() which now does
> ExecutorStart() for the PORTAL_MULTI_QUERY case.

I spent some time today reviewing 0001. Here are a few thoughts and
notes about things that I looked at.

First, I wondered whether it was really adequate for ExecEndPlan() to
just loop over estate->es_plan_nodes and call it good. Put
differently, is it possible that we could ever have more than one
relevant EState, say for a subplan or an EPQ execution or something,
so that this loop wouldn't cover everything? I found nothing to make
me think that this is a real danger.

Second, I wondered whether the ordering of cleanup operations could be
an issue. Right now, a node can position cleanup code before, after,
or both before and after recursing to child nodes, whereas with this
design change, the cleanup code will always be run before recursing to
child nodes. Here, I think we have problems. Both ExecGather and
ExecEndGatherMerge intentionally clean up the children before the
parent, so that the child shutdown happens before
ExecParallelCleanup(). Based on the comment and commit
acf555bc53acb589b5a2827e65d655fa8c9adee0, this appears to be
intentional, and you can sort of see why from looking at the stuff
that happens in ExecParallelCleanup(). If the instrumentation data
vanishes before the child nodes have a chance to clean things up,
maybe EXPLAIN ANALYZE won't reflect that instrumentation any more. If
the DSA vanishes, maybe we'll crash if we try to access it. If we
actually reach DestroyParallelContext(), we're just going to start
killing the workers. None of that sounds like what we want.

The good news, of a sort, is that I think this might be the only case
of this sort of problem. Most nodes recurse at the end, after doing
all the cleanup, so the behavior won't change. Moreover, even if it
did, most cleanup operations look pretty localized -- they affect only
the node itself, and not its children. A somewhat interesting case is
nodes associated with subplans. Right now, because of the coding of
ExecEndPlan, nodes associated with subplans are all cleaned up at the
very end, after everything that's not inside of a subplan. But with
this change, they'd get cleaned up in the order of initialization,
which actually seems more natural, as long as it doesn't break
anything, which I think it probably won't, since as I mention in most
cases node cleanup looks quite localized, i.e. it doesn't care whether
it happens before or after the cleanup of other nodes.

I think something will have to be done about the parallel query stuff,
though. I'm not sure exactly what. It is a little weird that Gather
and Gather Merge treat starting and killing workers as a purely
"private matter" that they can decide to handle without the executor
overall being very much aware of it. So maybe there's a way that some
of the cleanup logic here could be hoisted up into the general
executor machinery, that is, first end all the nodes, and then go
back, and end all the parallelism using, maybe, another list inside of
the estate. However, I think that the existence of ExecShutdownNode()
is a complication here -- we need to make sure that we don't break
either the case where that happen before overall plan shutdown, or the
case where it doesn't.

Third, a couple of minor comments on details of how you actually made
these changes in the patch set. Personally, I would remove all of the
"is closed separately" comments that you added. I think it's a
violation of the general coding principle that you should make the
code look like it's always been that way. Sure, in the immediate
future, people might wonder why you don't need to recurse, but 5 or 10
years from now that's just going to be clutter. Second, in the cases
where the ExecEndNode functions end up completely empty, I would
suggest just removing the functions entirely and making the switch
that dispatches on the node type have a switch case that lists all the
nodes that don't need a callback here and say /* Nothing do for these
node types */ break;. This will save a few CPU cycles and I think it
will be easier to read as well.

Fourth, I wonder whether we really need this patch at all. I initially
thought we did, because if we abandon the initialization of a plan
partway through, then we end up with a plan that is in a state that
previously would never have occurred, and we still have to be able to
clean it up. However, perhaps it's a difference without a distinction.
Say we have a partial plan tree, where not all of the PlanState nodes
ever got created. We then just call the existing version of
ExecEndPlan() on it, with no changes. What goes wrong? Sure, we might
call ExecEndNode() on some null pointers where in the current world
there would always be valid pointers, but ExecEndNode() will handle
that just fine, by doing nothing for those nodes, because it starts
with a NULL-check.

Another alternative design might be to switch ExecEndNode to use
planstate_tree_walker to walk the node tree, removing the walk from
the node-type-specific functions as in this patch, and deleting the
end-node functions that are no longer required altogether, as proposed
above. I somehow feel that this would be cleaner than the status quo,
but here again, I'm not sure we really need it. planstate_tree_walker
would just pass over any NULL pointers that it found without doing
anything, but the current code does that too, so while this might be
more beautiful than what we have now, I'm not sure that there's any
real reason to do it. The fact that, like the current patch, it would
change the order in which nodes are cleaned up is also an issue -- the
Gather/Gather Merge ordering issues might be easier to handle this way
with some hack in ExecEndNode() than they are with the design you have
now, but we'd still have to do something about them, I believe.

Sorry if this is a bit of a meandering review, but those are my thoughts.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2023-08-07 15:44:53 Re: generic plans and "initial" pruning
Previous Message Drouvot, Bertrand 2023-08-07 15:11:50 Re: WIP: new system catalog pg_wait_event