Re: generic plans and "initial" pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jacob Champion <jchampion(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: generic plans and "initial" pruning
Date: 2023-08-11 05:31:03
Message-ID: CA+HiwqFRJ9NsF5s_Yno3kQ4rLtkWxb86fikeUdjseub8j8rHeA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Aug 9, 2023 at 1:05 AM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Tue, Aug 8, 2023 at 10:32 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> > But should ExecInitNode() subroutines return the partially initialized
> > PlanState node or NULL on detecting invalidation? If I'm
> > understanding how you think this should be working correctly, I think
> > you mean the former, because if it were the latter, ExecInitNode()
> > would end up returning NULL at the top for the root and then there's
> > nothing to pass to ExecEndNode(), so no way to clean up to begin with.
> > In that case, I think we will need to adjust ExecEndNode() subroutines
> > to add `if (node->ps.ps_ResultTupleSlot)` in the above code, for
> > example. That's something Tom had said he doesn't like very much [1].
>
> Yeah, I understood Tom's goal as being "don't return partially
> initialized nodes."
>
> Personally, I'm not sure that's an important goal. In fact, I don't
> even think it's a desirable one. It doesn't look difficult to audit
> the end-node functions for cases where they'd fail if a particular
> pointer were NULL instead of pointing to some real data, and just
> fixing all such cases to have NULL-tests looks like purely mechanical
> work that we are unlikely to get wrong. And at least some cases
> wouldn't require any changes at all.
>
> If we don't do that, the complexity doesn't go away. It just moves
> someplace else. Presumably what we do in that case is have
> ExecInitNode functions undo any initialization that they've already
> done before returning NULL. There are basically two ways to do that.
> Option one is to add code at the point where they return early to
> clean up anything they've already initialized, but that code is likely
> to substantially duplicate whatever the ExecEndNode function already
> knows how to do, and it's very easy for logic like this to get broken
> if somebody rearranges an ExecInitNode function down the road.

Yeah, I too am not a fan of making ExecInitNode() clean up partially
initialized nodes.

> Option
> two is to rearrange the ExecInitNode functions now, to open relations
> or recurse at the beginning, so that we discover the need to fail
> before we initialize anything. That restricts our ability to further
> rearrange the functions in future somewhat, but more importantly,
> IMHO, it introduces more risk right now. Checking that the ExecEndNode
> function will not fail if some pointers are randomly null is a lot
> easier than checking that changing the order of operations in an
> ExecInitNode function breaks nothing.
>
> I'm not here to say that we can't do one of those things. But I think
> adding null-tests to ExecEndNode functions looks like *far* less work
> and *way* less risk.

+1

> There's a second issue here, too, which is when we abort ExecInitNode
> partway through, how do we signal that? You're rightly pointing out
> here that if we do that by returning NULL, then we don't do it by
> returning a pointer to the partially initialized node that we just
> created, which means that we either need to store those partially
> initialized nodes in a separate data structure as you propose to do in
> 0001,
>
> or else we need to pick a different signalling convention. We
> could change (a) ExecInitNode to have an additional argument, bool
> *kaboom, or (b) we could make it return bool and return the node
> pointer via a new additional argument, or (c) we could put a Boolean
> flag into the estate and let the function signal failure by flipping
> the value of the flag.

The failure can already be detected by seeing that
ExecPlanIsValid(estate) is false. The question is what ExecInitNode()
or any of its subroutines should return once it is. I think the
following convention works:

Return partially initialized state from ExecInit* function where we
detect the invalidation after calling ExecInitNode() on a child plan,
so that ExecEndNode() can recurse to clean it up.

Return NULL from ExecInit* functions where we detect the invalidation
after opening and locking a relation but before calling ExecInitNode()
to initialize a child plan if there's one at all. Even if we may set
things like ExprContext, TupleTableSlot fields, they are cleaned up
independently of the plan tree anyway via the cleanup called with
es_exprcontexts, es_tupleTable, respectively. I even noticed bits
like this in ExecEnd* functions:

- /*
- * Free the exprcontext(s) ... now dead code, see ExecFreeExprContext
- */
-#ifdef NOT_USED
- ExecFreeExprContext(&node->ss.ps);
- if (node->ioss_RuntimeContext)
- FreeExprContext(node->ioss_RuntimeContext, true);
-#endif

So, AFAICS, ExprContext, TupleTableSlot cleanup in ExecNode* functions
is unnecessary but remain around because nobody cared about and got
around to getting rid of it.

> If we do any of those things, then as far as I
> can see 0001 is unnecessary. If we do none of them but also avoid
> creating partially initialized nodes by one of the two techniques
> mentioned two paragraphs prior, then 0001 is also unnecessary. If we
> do none of them but do create partially initialized nodes, then we
> need 0001.
>
> So if this were a restaurant menu, then it might look like this:
>
> Prix Fixe Menu (choose one from each)
>
> First Course - How do we clean up after partial initialization?
> (1) ExecInitNode functions produce partially initialized nodes
> (2) ExecInitNode functions get refactored so that the stuff that can
> cause early exit always happens first, so that no cleanup is ever
> needed
> (3) ExecInitNode functions do any required cleanup in situ
>
> Second Course - How do we signal that initialization stopped early?
> (A) Return NULL.
> (B) Add a bool * out-parmeter to ExecInitNode.
> (C) Add a Node * out-parameter to ExecInitNode and change the return
> value to bool.
> (D) Add a bool to the EState.
> (E) Something else, maybe.
>
> I think that we need 0001 if we choose specifically (1) and (A). My
> gut feeling is that the least-invasive way to do this project is to
> choose (1) and (D). My second choice would be (1) and (C), and my
> third choice would be (1) and (A). If I can't have (1), I think I
> prefer (2) over (3), but I also believe I prefer hiding in a deep hole
> to either of them. Maybe I'm not seeing the whole picture correctly
> here, but both (2) and (3) look awfully painful to me.

I think what I've ended up with in the attached 0001 (WIP) is both
(1), (2), and (D). As mentioned above, (D) is implemented with the
ExecPlanStillValid() function.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v45-0006-Track-opened-range-table-relations-in-a-List-in-.patch application/octet-stream 2.4 KB
v45-0003-Add-field-to-store-parent-relids-to-Append-Merge.patch application/octet-stream 21.2 KB
v45-0005-Delay-locking-of-child-tables-in-cached-plans-un.patch application/octet-stream 70.1 KB
v45-0004-Set-inFromCl-to-false-in-child-table-RTEs.patch application/octet-stream 3.7 KB
v45-0002-Refactoring-to-move-ExecutorStart-calls-to-be-ne.patch application/octet-stream 25.8 KB
v45-0001-Add-support-for-allowing-ExecInitNode-to-detect-.patch application/octet-stream 52.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-08-11 05:48:09 Re: [PoC] pg_upgrade: allow to upgrade publisher node
Previous Message Noah Misch 2023-08-11 05:29:44 Re: Inconsistent results with libc sorting on Windows