From: | Richard Guo <guofenglinux(at)gmail(dot)com> |
---|---|
To: | Robert Haas <robertmhaas(at)gmail(dot)com> |
Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "bruce(at)momjian(dot)us" <bruce(at)momjian(dot)us>, lepihov(at)gmail(dot)com |
Subject: | Re: plan shape work |
Date: | 2025-09-09 02:22:40 |
Message-ID: | CAMbWs4_Ps-OFeNuWXF=oDXEvEq_t5KDTFUg=s7y9ZFMZ+qs2OA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Sep 8, 2025 at 10:56 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Mon, Sep 8, 2025 at 5:51 AM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
> > BTW, I'm wondering if we can take outer join relids into account in
> > assert_join_preserves_scan_rtis(), which could make the check more
> > useful. A joinrel's relids consists of three parts: the outer plan's
> > relids, the inner plan's relids, and the relids of outer joins that
> > are calculated at this join. We already have the first two. If we
> > can find a way to determine the third, we'd be able to assert that:
> >
> > outer_relids U inner_relids U outerjoin_relids == joinrel->relids
> >
> > Determining the third part can be tricky though, especially due to
> > outer-join identity 3: the "outerjoin_relids" of one outer join might
> > include more than one outer join relid. But I think this is still
> > doable.
> >
> > (This may not be useful for your overall goal in this patchset, so
> > feel free to ignore it if it's not of interest.)
>
> I don't mind doing the work if there's a reasonable and useful way of
> accomplishing the goal. However, one concern I have is that it seems
> pointless if we're computing outerjoin_relids by essentially redoing
> the same computation that set the join's relid set in the first place.
> In that case, the cross-check has no real probative value. All it
> would be demonstrating is that if you calculate outerjoin_relids twice
> using essentially the same methodology, you get the same answer. That
> seems like a waste of code to me. If there's a way to calculate
> outerjoin_relids using a different methodology than what we used when
> populating the joinrelids, that would be interesting. It would be
> similar to how the existing code recomputes the outer and inner relids
> in a way that can potentially find issues that otherwise would not
> have been spotted (such as the Result node case).
>
> Do you have a proposal?

One idea (not fully thought through) is that we record the calculated
outerjoin_relids for each outer join in its JoinPaths. (We cannot
store this in the joinrel's RelOptInfo because it varies depending on
the join sequence we use.) And then we could use the recorded
outerjoin_relids for the assertion here:

outer_relids U inner_relids U joinpath->ojrelids == joinrel->relids

The value of this approach, IMO, is that it could help verify the
correctness of how we compute each outer join's outerjoin_relids, i.e.,
the logic in add_outer_joins_to_relids(), which is quite complex due to
outer-join identity 3. If we miscalculate the outerjoin_relids for
a particular outer join, this assertion would catch it.
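
To make the shape of the check concrete, here is a rough standalone
sketch. It is not PostgreSQL code: RelidMask is a 64-bit stand-in for
Relids/Bitmapset, and SketchJoinPath, relid_bit() and the two check
functions are hypothetical names made up for illustration. It only
assumes that the ojrelids computed for a join are stored on the path
when the path is built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for Relids (a Bitmapset in PostgreSQL):
 * bit i set means range-table index i is a member. Limited to RTIs 0..63.
 */
typedef uint64_t RelidMask;

/* Hypothetical, simplified join path holding only what the check needs. */
typedef struct SketchJoinPath
{
    RelidMask outer_relids; /* relids recomputed from the outer subplan */
    RelidMask inner_relids; /* relids recomputed from the inner subplan */
    RelidMask ojrelids;     /* outer-join relids recorded at path creation */
} SketchJoinPath;

/* Build a one-member set for the given range-table index (0..63). */
static inline RelidMask
relid_bit(int rti)
{
    return (RelidMask) 1 << rti;
}

/* True when outer U inner U ojrelids equals the joinrel's relids. */
static bool
join_relids_are_consistent(const SketchJoinPath *path, RelidMask joinrel_relids)
{
    RelidMask combined = path->outer_relids | path->inner_relids | path->ojrelids;

    return combined == joinrel_relids;
}

/* Assertion form, as it might be used in an assert-enabled build. */
static void
assert_join_relids_consistent(const SketchJoinPath *path, RelidMask joinrel_relids)
{
    assert(join_relids_are_consistent(path, joinrel_relids));
}
```

Because of identity 3, the recorded ojrelids for a single join may hold
more than one outer-join relid. In this sketch that is just an ojrelids
mask with several bits set, and a missing or extra bit makes the
comparison fail.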

However, this shouldn't be a requirement for committing your patches.
Maybe we should discuss it in a separate thread.

- Richard