Re: plan shape work

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Richard Guo <guofenglinux(at)gmail(dot)com>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "bruce(at)momjian(dot)us" <bruce(at)momjian(dot)us>, lepihov(at)gmail(dot)com
Subject: Re: plan shape work
Date: 2025-09-15 16:32:15
Message-ID: CA+TgmobYm2k9X5BZUoch15S092kt+CPcbRKaeDAPFovEjcg_Mw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks for the review. Comments to which I don't respond below are duly noted.

On Sun, Sep 14, 2025 at 7:42 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> * I think your placement of the show_result_replacement_info call
> site suffers from add-at-the-end syndrome. It should certainly go
> before the show_instrumentation_count call: IMO we expect stuff
> added by EXPLAIN ANALYZE to appear after stuff that's there in plain
> EXPLAIN. But I'd really argue that from a user's standpoint this
> information is part of the fundamental plan structure and so it
> deserves more prominence. I'd lean to putting it first in the
> T_Result case, before the "One-Time Filter". (Thought experiment:
> if we'd had this EXPLAIN field from day one, where do you think
> it would have been placed?)

Yes, I wondered if it should actually look more like this:

Degenerate Scan on blah
Degenerate Join on blah, blah, blah
Degenerate Aggregate
MinMaxAggregate Result

So getting rid of Replaces: altogether. In fact, "Replaces" is really
a complete misnomer in the case of a MinMaxAggregate; I'm just not
sure exactly what to do instead.

> + case RESULT_TYPE_UPPER:
> + /* a small white lie */
> + replacement_type = "Aggregate";
> + break;
>
> I find this unconvincing: is it really an aggregate? It doesn't
> help that this case doesn't seem to be reached anywhere in the
> regression tests.

This is the case I know about where that can be reached.

robert.haas=# explain select 1 as a, 2 as b having false;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.01 rows=1 width=8)
One-Time Filter: false
Replaces: Aggregate
(3 rows)

> In general I suspect that we'll have to refine RESULT_TYPE_UPPER
> in the future. I don't think this fear needs to block committing
> of what you have though.

+1.

> + * (Arguably, we should instead display the RTE name in some other way in
> + * such cases, but in typical cases the RTE name is *RESULT* and printing
> + * "Result on *RESULT*" or similar doesn't seem especially useful, so for
> + * now we don't print anything at all.)
>
> Right offhand, I think that RTE_RESULT *always* has the name *RESULT*,
> so the "typical" bit seems misleading. Personally I'd drop this
> para altogether.

Counterexample:

robert.haas=# explain verbose select * from pgbench_accounts where 0 = 1;
QUERY PLAN
------------------------------------------
Result (cost=0.00..0.00 rows=0 width=0)
Output: aid, bid, abalance, filler
One-Time Filter: false
Replaces: Scan on pgbench_accounts
(4 rows)

debug_print_plan says:

:alias <>
:eref
{ALIAS
:aliasname pgbench_accounts
:colnames ("aid" "bid" "abalance" "filler")
}

> In general, I wonder if it'd be better for the callers of
> make_xxx_result to pass in the result_type to use. Those
> functions have such a narrow view of the available info
> that I'm dubious that they can get it right. Especially
> if/when we decide that RESULT_TYPE_UPPER needs subdivision.

That was my first thought, but after experimentation I think it sucks,
especially because of this:

/*
* The only path for it is a trivial Result path. We cheat a
* bit here by using a GroupResultPath, because that way we
* can just jam the quals into it without preprocessing them.
* (But, if you hold your head at the right angle, a FROM-less
* SELECT is a kind of degenerate-grouping case, so it's not
* that much of a cheat.)
*/

I would argue that if you hold your head at that angle, you need your
head examined. Perhaps that wasn't the case when this comment was
written, but from the viewpoint of this project, I think it's pretty
clear. What this does is use create_group_result_path() not only for
actual degenerate-grouping cases (where we're replacing an aggregate)
but also for no-FROM-clause cases (where we're replacing a scan). We
could adjust things so that the no-FROM-clause case doesn't take this
code path, or passes down a flag, but it's clear from examination of
the RelOptInfo. Likewise, for scans vs. joins, peaking at the
reloptkind is a very easy way to tell whether we've got a baserel or a
joinrel and having the caller go to extra trouble to pass it down just
seems silly.

> + * relids identifies the relation for which this Result node is generating the
> + * tuples. When subplan is not NULL, it should be empty: this node is not
> + * generating anything in that case, just acting on tuples generated by the
> + * subplan. Otherwise, it may contain a single RTI (as when this Result node
> + * is substituted for a scan); multiple RTIs (as when this Result node is
> + * substituted for a join); or no RTIs at all (as when this Result node is
> + * substituted for an upper rel).
>
> I doubt this claim that the relid set will be empty for an upper rel.
> I think it's more likely that it will include all the rels for the
> query.

Upper rels are created by fetch_upper_rel(). The third argument
becomes the relids set. Most call sites pass that argument as NULL:

[robert.haas pgsql]$ git grep fetch_upper_rel src/backend/optimizer/
src/backend/optimizer/path/allpaths.c: sub_final_rel =
fetch_upper_rel(rel->subroot, UPPERREL_FINAL, NULL);
src/backend/optimizer/path/costsize.c: sub_final_rel =
fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
src/backend/optimizer/plan/planagg.c: grouped_rel =
fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
src/backend/optimizer/plan/planner.c: final_rel =
fetch_upper_rel(root, UPPERREL_FINAL, NULL);
src/backend/optimizer/plan/planner.c: final_rel =
fetch_upper_rel(root, UPPERREL_FINAL, NULL);
src/backend/optimizer/plan/planner.c: final_rel =
fetch_upper_rel(root, UPPERREL_FINAL, NULL);
src/backend/optimizer/plan/planner.c: grouped_rel =
fetch_upper_rel(root, UPPERREL_GROUP_AGG,
src/backend/optimizer/plan/planner.c: grouped_rel =
fetch_upper_rel(root, UPPERREL_GROUP_AGG, NULL);
src/backend/optimizer/plan/planner.c: window_rel =
fetch_upper_rel(root, UPPERREL_WINDOW, NULL);
src/backend/optimizer/plan/planner.c: distinct_rel =
fetch_upper_rel(root, UPPERREL_DISTINCT, NULL);
src/backend/optimizer/plan/planner.c: partial_distinct_rel =
fetch_upper_rel(root, UPPERREL_PARTIAL_DISTINCT,
src/backend/optimizer/plan/planner.c: ordered_rel =
fetch_upper_rel(root, UPPERREL_ORDERED, NULL);
src/backend/optimizer/plan/planner.c: partially_grouped_rel =
fetch_upper_rel(root,
src/backend/optimizer/plan/setrefs.c:
IS_DUMMY_REL(fetch_upper_rel(rel->subroot,
src/backend/optimizer/plan/subselect.c: final_rel =
fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
src/backend/optimizer/plan/subselect.c: final_rel =
fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
src/backend/optimizer/plan/subselect.c: final_rel =
fetch_upper_rel(subroot, UPPERREL_FINAL, NULL);
src/backend/optimizer/prep/prepunion.c: result_rel =
fetch_upper_rel(root, UPPERREL_SETOP,
src/backend/optimizer/prep/prepunion.c: final_rel =
fetch_upper_rel(rel->subroot, UPPERREL_FINAL, NULL);
src/backend/optimizer/prep/prepunion.c: result_rel =
fetch_upper_rel(root, UPPERREL_SETOP, relids);
src/backend/optimizer/prep/prepunion.c: result_rel =
fetch_upper_rel(root, UPPERREL_SETOP,
src/backend/optimizer/util/relnode.c: * fetch_upper_rel
src/backend/optimizer/util/relnode.c:fetch_upper_rel(PlannerInfo
*root, UpperRelationKind kind, Relids relids)

The exceptions are: make_grouping_rel() passes a relids set when
IS_OTHER_REL(input_rel); create_partial_grouping_paths() passes a
relids set when creating UPPERREL_PARTIAL_GROUP_AGG; and
generate_union_paths() and generate_nonunion_paths() in prepunion.c
bubble up the underlying relids. AFAICS, a non-parallel,
non-partitionwise aggregate ends up with an empty relid set, and even
those cases end up with an empty relid set for the topmost grouping
rel, even if they create some other upper rels that do have non-empty
relid sets.

--
Robert Haas
EDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2025-09-15 16:45:09 Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Previous Message Dmitry Mityugov 2025-09-15 16:23:30 Re: --with-llvm on 32-bit platforms?