Re: plan shape work

From: Andrei Lepikhov <lepihov(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Andy Fan <zhihuifan1213(at)163(dot)com>, maciek(at)pganalyze(dot)com, Tomas Vondra <tomas(at)vondra(dot)me>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: plan shape work
Date: 2025-08-27 09:41:00
Message-ID: a5d1190d-42e5-4cb3-82dd-6eedeb5c7329@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19/5/2025 20:01, Robert Haas wrote:
> Hope you find this interesting. If you do, let me know what you think.Thanks for looking in this direction!

Since 2017, we designed features that should 'memorise' the experience
of previous executions, checking the PlanState tree and instrumentation
after execution.
The first dumb prototype you should know - AQO. The following, more
robust code is 'replan' [1]. These two features utilise the core patch,
and I hope this patch can be removed after a few adjustments to the core.
In fact, 'replan' is used to stop execution in the middle (if the
time/memory usage limit is achieved), pass through the state, collect
valuable data and execute again. That's why using data for the
subsequent execution needs a matching query tree, and that data may be
irrelevant for a different set of constants.

The lessons learned during design and after some usage in real life:
1. Subplans should be uniquely identified during the planning. Like you
mentioned, two similar subplans on the same query level may be executed
differently and provide different row numbers to the 'knowledge base'. I
used the subquery_planner hack that generated a unique ID for each subplan.
2. Each node should be identified - I used a kind of signature at each
RelOptInfo node - just a hash generated by restrictinfos and underlying
signatures. This approach needs a create_plan hook to copy the signature
to the Plan nodes.
3. A great source of fluctuations is 'never executed' nodes - because
depending on the constant, the subtree may never be executed or produce
tons of tuples - I resolved it by just ignoring 'never executed'
results, no in-core changes needed.
4. A tree of paths may implement a single RelOptInfo. I have saved the
signature at the top of the Plan node for the corresponding RelOptInfo
and have never encountered any problems yet. It limits the use of
gathered data on cardinality/group number/peak memory consumption
somewhat, but not significantly.

I think the node extension fields and hooks, debated in the thread [2],
may be enough to enable extensions to implement such a feature.

What's more, I personally prefer to have a hook that allows extension
checking of a condition during execution - it may be done inside the
ExecProcNode() by calling the node-specific hook, which an extension may
initialise in the PlanState structure before the start of execution.
Additionally, it would be beneficial to have a hook for error processing
at the top of the portal execution code - this is a key point for
managing query resources. If we need to stop and re-execute the query,
it is the only place where we can release all the resources assigned.
One more helpful thing - an option to postpone receiver sending data out
of the instance-side, even result headings. We may want to decide on the
query plan after some tuples are produced, and if something has already
been sent to the user, we can't just stop and rebuild the plan.

[1] https://postgrespro.com/docs/enterprise/16/realtime-query-replanning
[2]
https://www.postgresql.org/message-id/flat/CA%2BTgmoYxfg90rw13%2BJcYwn4dwSC%2Bagw7o8-A%2BfA3M0fh96pg8w%40mail.gmail.com

--
regards, Andrei Lepikhov

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Bertrand Drouvot 2025-08-27 09:46:55 Re: Per backend relation statistics tracking
Previous Message Tomas Vondra 2025-08-27 09:39:35 Re: Changing the state of data checksums in a running cluster