Re: plan shape work

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "bruce(at)momjian(dot)us" <bruce(at)momjian(dot)us>, lepihov(at)gmail(dot)com
Subject: Re: plan shape work
Date: 2025-09-10 07:16:29
Message-ID: CAMbWs49MzvVF+he-rpU3rY=Pdbz_iLDi=wrdWmH6geVS2H5Sxg@mail.gmail.com
Lists: pgsql-hackers

On Tue, Sep 9, 2025 at 10:18 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Sep 8, 2025 at 10:22 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > One idea (not fully thought through) is that we record the calculated
> > outerjoin_relids for each outer join in its JoinPaths. (We cannot
> > store this in the joinrel's RelOptInfo because it varies depending on
> > the join sequence we use.) And then we could use the recorded
> > outerjoin_relids for the assertion here:
> >
> > outer_relids U inner_relids U joinpath->ojrelids == joinrel->relids

> I'm OK with moving the conversation to a separate thread, but can you
> clarify from where you believe that joinpath->ojrelids would be
> populated? It seems to me that the assertion couldn't pass unless
> every join path ended up with the same value of joinpath->ojrelids.
> That's because, for a given joinrel, there is only one value of
> joinrel->relids; and all of those RTIs must be either RTE_JOIN or
> non-RTE_JOIN. The non-RTE_JOIN RTIs will be found only in outer_relids
> U inner_relids, and the RTE_JOIN RTIs will be found only in
> joinpath->ojrelids. Therefore, it seems impossible for the assertion
> to pass unless the value is the same for all join paths.

Hmm, this isn't quite what I had in mind. What I was thinking is that
the outer join relids included in joinrel->relids can also come from
its outer or inner side, not only from the join being formed. For
example, consider a query like:

(A leftjoin B on (Pab)) leftjoin C on (Pbc)

For the join with joinrel->relids being {1, 2, 3, 4, 5}, {1, 2, 3}
comes from the outer side, {4} comes from the inner side, and {5} is
the outer join being calculated at this join. So the Assert I
proposed earlier becomes:

{1, 2, 3} U {4} U {5} == {1, 2, 3, 4, 5}

However, if we have transformed it to:

A leftjoin (B leftjoin C on (Pbc)) on (Pab)

For this same join, {1} comes from the outer side, {2, 4} comes from
the inner side, and {3, 5} are the outer joins being calculated at
this join. So the Assert becomes:

{1} U {2, 4} U {3, 5} == {1, 2, 3, 4, 5}

Either way, the assertion should always hold -- if it doesn't, there's
likely a bug in how we're calculating the relids.

As you can see, the set of outer joins calculated at the same join can
vary depending on the join order. What I suggested is to record this
information in JoinPaths (or maybe also in Join plan nodes, so that
get_scanned_rtindexes can collect it) and use it for the assertion.
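
To make the two cases above concrete, here is a minimal standalone
sketch of the check. It is not planner code: ToyRelids is a plain
bitmask standing in for Relids/Bitmapset, and ToyJoinPath, its fields,
and toy_joinpath_relids_ok are hypothetical names invented for
illustration. The only thing taken from the discussion is the identity
outer_relids U inner_relids U ojrelids == joinrel->relids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Toy relid set: bit i set means range-table index i is a member.
 * PostgreSQL uses Bitmapset for this; the bitmask is only a stand-in.
 */
typedef uint32_t ToyRelids;

#define RT(i) ((ToyRelids) 1u << (i))

/* Hypothetical, simplified join path that records its own ojrelids. */
typedef struct ToyJoinPath
{
    ToyRelids   outer_relids;   /* relids supplied by the outer side */
    ToyRelids   inner_relids;   /* relids supplied by the inner side */
    ToyRelids   ojrelids;       /* outer joins calculated at this join */
} ToyJoinPath;

/* The proposed check: outer U inner U ojrelids == joinrel->relids */
static bool
toy_joinpath_relids_ok(const ToyJoinPath *path, ToyRelids joinrel_relids)
{
    return (path->outer_relids | path->inner_relids | path->ojrelids)
        == joinrel_relids;
}

/* The joinrel from the example: relids {1, 2, 3, 4, 5} */
static const ToyRelids toy_joinrel_relids =
    RT(1) | RT(2) | RT(3) | RT(4) | RT(5);

/* (A leftjoin B on (Pab)) leftjoin C on (Pbc) */
static const ToyJoinPath toy_path_ab_then_c = {
    RT(1) | RT(2) | RT(3),      /* outer side */
    RT(4),                      /* inner side */
    RT(5)                       /* calculated at this join */
};

/* A leftjoin (B leftjoin C on (Pbc)) on (Pab) */
static const ToyJoinPath toy_path_a_then_bc = {
    RT(1),                      /* outer side */
    RT(2) | RT(4),              /* inner side */
    RT(3) | RT(5)               /* calculated at this join */
};
```

Calling toy_joinpath_relids_ok() on either path with
toy_joinrel_relids returns true, even though the two paths carry
different ojrelids. Dropping relid 3 from the second path's ojrelids
makes the check fail, which is the kind of miscalculation the
assertion is meant to catch.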

- Richard
