From: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Richard Guo <guofenglinux(at)gmail(dot)com>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "bruce(at)momjian(dot)us" <bruce(at)momjian(dot)us>, lepihov(at)gmail(dot)com |
Subject: | Re: plan shape work |
Date: | 2025-09-24 22:03:19 |
Message-ID: | 3641043.1758751399@sss.pgh.pa.us |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
Robert Haas <robertmhaas(at)gmail(dot)com> writes:
> Here's a new patch set. 0004 is now 0001 and similarly all other patch
> numbers are -3, since the old 0001 and 0002 were committed together
> and the 0003 is abandoned. I made the following changes to
> old-0004/new-0001:
I've not looked at all of these patches, but here's a review of v9-0001.
> - I rewrote the commit message. I'm not really sure this is any
> clearer about the motivation for this patch, but I tried. Suggestions
> appreciated.
It's much better, thanks.
> - You (Tom) complained about the lack of const on
> sublinktype_to_string, so this version has been const-ified. The const
> bled into the arguments to choose_plan_name() and subquery_planner(),
> and into the plan_name structure members within PlannerInfo and
> SubPlan. I don't know if this is the right thing to do, so feel free
> to set me straight.
I don't think so. We do not have a nice story on marking Node fields
const: it's very unclear for example what consequences that ought to
have for copyObject(). Maybe somebody will tackle that issue someday,
but it's not something to touch casually in a patch with other
objectives. So I don't think we can make the plan_name fields const.
The best solution I think is to make choose_plan_name() take a const
string and return a non-const one. The attached v10-0001 is just like
your v9-0001 except for doing the const stuff this way. I chose to
fix the impedance mismatch within choose_plan_name() by having it
pstrdup when it wants to just return the "name" string, but you could
make a case for holding your nose and just casting away const there.
> - You (Tom) also asked why not print InitPlan/SubPlan wherever we
> refer to subplans, so this version restores that behavior.
Thanks. I'm good with the output now (modulo the bug described
below). Someone could potentially argue that this exposes more
of the internals than we really ought to, such as the difference
between expr and multiexpr SubLinks, but I'm okay with that.
Aside from the const issue, something I don't really like at the
coding level is the use of an "allroots" list. One reason is that
it's partially redundant with the adjacent "subroots" list, but
a bigger one is that we have transient roots that shouldn't be
in there. An example here is pull_up_simple_subquery: it builds
a clone of the query's PlannerInfo to help it use various
infrastructure along the way to flattening the subquery, but
that clone is not referenced anymore after the function exits.
You were putting that into allroots, which seems to me to be
a fundamental error, even more so because it went in with the
same plan_name as the root it was cloned from.
I think a better idea is to keep a list of just the subplan
names that we've assigned so far. That has a far clearer
charter, plus it can be updated immediately by choose_plan_name()
instead of relying on the caller to do the right thing later.
I coded this up, and was rather surprised to find that it changed
some regression outputs. On investigation, that's because
build_minmax_path() was actually doing the wrong thing later:
it was putting the wrong root into allroots, so that "minmax_1"
never became assigned and could be re-used later.
I also observed that SS_process_ctes() was not using
choose_plan_name() but simply assigning the user-written CTE
name. I believe it's possible to use the same CTE name in
different parts of a query tree, so this fails to achieve
the stated purpose of making the names unique.
I'm still a little bit uncomfortable about whether
it's okay for pull_up_simple_subquery() to just do
+ subroot->plan_name = root->plan_name;
rather than giving some other name to the transient subroot.
I think it's okay because we are not making any meaningful planning
decisions during the life of the subroot, just seeing if we can
transform the subquery into a form that allows it to be pulled up.
But you might think differently. Perhaps a potential compromise
is to set the transient subroot's plan_name to NULL instead?
Anyway, v10-0002 is a delta patch to use a list of subplan
names instead of "allroots", and there are a couple of trivial
cosmetic changes too.
regards, tom lane
Attachment | Content-Type | Size |
---|---|---|
v10-0001-Assign-each-subquery-a-unique-name-prior-to-plan.patch | text/x-diff | 164.0 KB |
v10-0002-Use-a-list-of-subplan-names-instead-of-a-list-of.patch | text/x-diff | 10.1 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Tomas Vondra | 2025-09-24 23:02:28 | Re: Fix overflow of nbatch |
Previous Message | Andrew Kim | 2025-09-24 21:50:44 | Re: Proposal for enabling auto-vectorization for checksum calculations |