Re: generic plans and "initial" pruning

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Daniel Gustafsson <daniel(at)yesql(dot)se>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jacob Champion <jchampion(at)timescale(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, Thom Brown <thom(at)linux(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Subject: Re: generic plans and "initial" pruning
Date: 2023-09-06 14:20:06
Message-ID: CA+TgmoYP-Aimev3Wa6O10TC3u_eRsODAZo0Lff3Ey-2mHdZAzw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 6, 2023 at 5:12 AM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> Attached updated patches. Thanks for the review.

I think 0001 looks ready to commit. I'm not sure that the commit
message needs to mention future patches here, since this code cleanup
seems like a good idea regardless, but if you feel otherwise, fair
enough.

On 0002, some questions:

- In ExecEndLockRows, is the call to EvalPlanQualEnd a concern? i.e.
Does that function need any adjustment?
- In ExecEndMemoize, should there be a null-test around
MemoryContextDelete(node->tableContext) as we have in
ExecEndRecursiveUnion, ExecEndSetOp, etc.?

I wonder how we feel about setting pointers to NULL after freeing the
associated data structures. The existing code isn't consistent about
doing that, and making it do so would be a fairly large change that
would bloat this patch quite a bit. On the other hand, I think it's a
good practice as a general matter, and we do do it in some ExecEnd
functions.

On 0003, I have some doubt about whether we really have all the right
design decisions in detail here:

- Why have this weird rule where sometimes we return NULL and other
times the planstate? Is there any point to such a coding rule? Why not
just always return the planstate?

- Is there any point to all of these early exit cases? For example, in
ExecInitBitmapAnd, why exit early if initialization fails? Why not
just plunge ahead and if initialization failed the caller will notice
that and when we ExecEndNode some of the child node pointers will be
NULL but who cares? The obvious disadvantage of this approach is that
we're doing a bunch of unnecessary initialization, but we're also
speeding up the common case where we don't need to abort by avoiding a
branch that will rarely be taken. I'm not quite sure what the right
thing to do is here.

- The cases where we call ExecGetRangeTableRelation or
ExecOpenScanRelation are a bit subtler ... maybe initialization that
we're going to do later is going to barf if the tuple descriptor of
the relation isn't what we thought it was going to be. In that case it
becomes important to exit early. But if that's not actually a problem,
then we could apply the same principle here also -- don't pollute the
code with early-exit cases, just let it do its thing and sort it out
later. Do you know what the actual problems would be here if we didn't
exit early in these cases?

- Depending on the answers to the above points, one thing we could
think of doing is put an early exit case into ExecInitNode itself: if
(unlikely(!ExecPlanStillValid(whatever)) return NULL. Maybe Andres or
someone is going to argue that that checks too often and is thus too
expensive, but it would be a lot more maintainable than having similar
checks strewn throughout the ExecInit* functions. Perhaps it deserves
some thought/benchmarking. More generally, if there's anything we can
do to centralize these checks in fewer places, I think that would be
worth considering. The patch isn't terribly large as it stands, so I
don't necessarily think that this is a critical issue, but I'm just
wondering if we can do better. I'm not even sure that it would be too
expensive to just initialize the whole plan always, and then just do
one test at the end. That's not OK if the changed tuple descriptor (or
something else) is going to crash or error out in a funny way or
something before initialization is completed, but if it's just going
to result in burning a few CPU cycles in a corner case, I don't know
if we should really care.

- The "At this point" comments don't give any rationale for why we
shouldn't have received any such invalidation messages. That makes
them fairly useless; the Assert by itself clarifies that you think
that case shouldn't happen. The comment's job is to justify that
claim.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2023-09-06 14:22:24 Re: GUC for temporarily disabling event triggers
Previous Message Tom Lane 2023-09-06 14:00:20 Re: Output affected rows in EXPLAIN