| From: | Robert Haas <robertmhaas(at)gmail(dot)com> |
|---|---|
| To: | Greg Burd <greg(at)burd(dot)me> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: pg_plan_advice |
| Date: | 2025-12-09 19:34:43 |
| Message-ID: | CA+TgmoYWLkNMhH1Uiyww7KysiHtFfPLr+yJrp5-CLoiD+g_+gg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Mon, Dec 8, 2025 at 3:39 PM Greg Burd <greg(at)burd(dot)me> wrote:
> Thanks for working on this! I think the idea has merit, I hope it lands sometime soon.
Thanks. I think it needs a good deal more review first, but I
appreciate the support.
> > Attachments:
> > * v5-0001-Store-information-about-range-table-flattening-in.patch
>
> contrib/pg_overexplain/pg_overexplain.c
>
> + /* Advance to next SubRTInfo, if it's time. */
> + if (lc_subrtinfo != NULL)
> + {
> + next_rtinfo = lfirst(lc_subrtinfo);
> + if (rti > next_rtinfo->rtoffset)
>
> Should the test be >= not >? Unless I am I reading this wrong, when rti == rtoffset, that's the first entry of the new subplan's range table. That would mean that the current logic skips displaying the subplan name for the first RTE of each subplan.
I don't think so. I think I actually had it that way at one point, and
I believe I found that it was wrong. RTIs are 1-based, so the smallest
per-subquery RTI is 1. rtoffset is the amount that must be added to
the per-subquery RTI to get a "flat" RTI that can be used to index
into the final range table. But if you find that theoretical argument
unconvincing, by all means please test it and see what happens!
> > This commit teaches pg_overexplain'e RANGE_TABLE option to make use
> Minor nit in the commit message, "pg_overexplain'e" should be "pg_overexplain's"
Thanks, fixed in my local branch.
> > * v5-0002-Store-information-about-elided-nodes-in-the-final.patch
>
> +/*
> + * Record some details about a node removed from the plan during setrefs
> + * procesing, for the benefit of code trying to reconstruct planner decisions
> + * from examination of the final plan tree.
> + */
>
> Nit, "procesing" should be "processing"
Thanks, fixed in my local branch.
> > * v5-0003-Store-information-about-Append-node-consolidation.patch
>
> src/backend/optimizer/path/allpaths.c
>
> /* Now consider each interesting sort ordering */
> foreach(lcp, all_child_pathkeys)
> {
> List *subpaths = NIL;
> bool subpaths_valid = true;
> + List *subpath_cars = NIL;
> List *startup_subpaths = NIL;
> bool startup_subpaths_valid = true;
> + List *startup_subpath_cars = NIL;
> List *partial_subpaths = NIL;
> + List *partial_subpath_cars = NIL;
> List *pa_partial_subpaths = NIL;
> List *pa_nonpartial_subpaths = NIL;
> + List *pa_subpath_cars = NIL;
>
> I find "cars" a bit cryptic (albeit clever), I think I've decoded it properly and it stands for "child_append_relid_sets", correct? Could you add a comment or use a clearer name like subpath_child_relids or consolidated_relid_sets?
I certainly admit that this is a bit too clever. I am not entirely
sure how to make it less clever. There needs to be a
child-append-relid-sets list corresponding to every current and future
subpath list, and the names of some of those subpath lists are already
quite long, so whatever naming convention we choose for the "cars"
lists had better not add too much more length to the variable name. I
felt like someone looking at this might initially be confused by what
"cars" meant, but then I thought that they would probably look at how
the variable was used and see that it was for example being passed as
the second argument to get_singleton_append_subpath(), which is named
child_append_relid_sets, or being passed to create_append_path or
create_merge_append_path, which also use that naming. I figured that
this would clear up the confusion pretty quickly. I could certainly
add a comment above this block of variable assignments saying
something like "for each list of paths, we must also maintain a list
of child append relid sets, etc. etc." but I worried that this would
create as much confusion as it solved, i.e. somebody reading the code
would be going: why is this comment here? Is it trying to tell me that
there's something weirder going on than what is anyway obvious?
If I get more opinions that some clarification is needed here, I'm
happy to change it, especially if those opinions agree with each other
on exactly what to change, but I think for now I'll leave it as it is.
> +accumulate_append_subpath(Path *path, List **subpaths, List **special_subpaths,
> + List **child_append_relid_sets)
> {
> if (IsA(path, AppendPath))
> {
> @@ -2219,6 +2256,8 @@ accumulate_append_subpath(Path *path, List **subpaths, List **special_subpaths)
> if (!apath->path.parallel_aware || apath->first_partial_path == 0)
> {
> *subpaths = list_concat(*subpaths, apath->subpaths);
> + *child_append_relid_sets =
> + lappend(*child_append_relid_sets, path->parent->relids);
>
> Is it possible that when pulling up multiple subpaths from an AppendPath, only ONE relid set is added to child_append_relid_sets, but MULTIPLE paths are added to subpaths? If so, that would break the correspondence between the lists which would be bad, right?
That would indeed be bad, but I'm not clear on how you think it could
happen. Can you clarify?
> src/include/nodes/pathnodes.h
> + * Whenever accumulate_append_subpath() allows us to consolidate multiple
> + * levels of Append paths are consolidated down to one, we store the RTI
> + * sets for the omitted paths in child_append_relid_sets. This is not necessary
> + * for planning or execution; we do it for the benefit of code that wants
> + * to inspect the final plan and understand how it came to be.
>
> Minor: "paths are consolidated" is redundant, should be "paths consolidated" or "allows us to consolidate".
Thanks, fixed in my local branch.
> > * v5-0004-Allow-for-plugin-control-over-path-generation-str.patch
>
> src/backend/optimizer/path/costsize.c
> + else
> + enable_mask |= PGS_CONSIDER_NONPARTIAL;
>
> - path->disabled_nodes = enable_seqscan ? 0 : 1;
> + path->disabled_nodes =
> + (baserel->pgs_mask & enable_mask) == enable_mask ? 0 : 1;
>
> When parallel_workers > 0 the path is partial and doesn't need PGS_CONSIDER_NONPARTIAL. But if parallel_workers == 0, it's non-partial and DOES need it, right? Would this mean that non-partial paths can be disabled even when the scan type itself (e.g., PGS_SEQSCAN) is enabled? Intentional?
See this comment:
* Finally, unsetting PGS_CONSIDER_NONPARTIAL disables all non-partial paths
* except those that use Gather or Gather Merge. In most other cases, a
* plugin can nudge the planner toward a particular strategy by disabling
* all of the others, but that doesn't work here: unsetting PGS_SEQSCAN,
* for instance, would disable both partial and non-partial sequential scans.
> It seems this is still WIP with a solid start, I'm not going to dig too much into it. :)
>
> Keep it up, best.
Thanks for the review so far!
--
Robert Haas
EDB: http://www.enterprisedb.com
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nathan Bossart | 2025-12-09 19:35:16 | Re: vacuumdb: add --dry-run |
| Previous Message | Melanie Plageman | 2025-12-09 19:31:09 | Re: [PATCH] VACUUM: avoid pre-creation transactions holding back cleanup of newly created relations |