Re: Making Vars outer-join aware

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>
Subject: Re: Making Vars outer-join aware
Date: 2022-08-16 19:24:04
Message-ID: 1405792.1660677844@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I wrote:
> Richard Guo <guofenglinux(at)gmail(dot)com> writes:
>> If the two issues are indeed something we need to fix, maybe we can
>> change add_placeholders_to_joinrel() to search the PHVs from
>> outer_rel/inner_rel's targetlist, and add the ojrelid to phnullingrels
>> if needed, just like what we do in build_joinrel_tlist(). The PHVs there
>> should have the correct phnullingrels (at least the PHVs in base rels'
>> targetlists have correctly set phnullingrels to NULL).

> Yeah, maybe we should do something more invasive and make use of the
> input targetlists rather than doing this from scratch. Not sure.
> I'm worried that doing it that way would increase the risk of getting
> different join tlist contents depending on which pair of input rels
> we happen to consider first.

After chewing on that for awhile, I've concluded that that is the way
to go. 0001 attached is a standalone patch to rearrange the way that
PHVs are added to joinrel targetlists. It results in one cosmetic
change in the regression tests, where the targetlist order for an
intermediate join node changes. I think that's fine; if anything,
the new order is more sensible than the old because it matches the
inputs' targetlist orders better.

I believe the reason I didn't do it like this to start with is that
I was concerned about the cost of searching the placeholder_list
repeatedly. With a lot of PHVs, build_joinrel_tlist becomes O(N^2)
just from the repeated find_placeholder_info lookups. We can fix
that by adding an index array to go straight from phid to the
PlaceHolderInfo. While thinking about where to construct such
an index array, I decided it'd be a good idea to have an explicit
step to "freeze" the set of PlaceHolderInfos, at the start of
deconstruct_jointree. This allows getting rid of the create_new_ph
flags for find_placeholder_info and add_vars_to_targetlist, which
I've always feared were bugs waiting to happen: they require callers
to have a very clear understanding of when they're invoked. There
might be some speed gain over existing code too, but I've not really
tried to measure it. I did drop a couple of hacks that were only
meant to short-circuit find_placeholder_info calls; that function
should now be cheap enough to not matter.

Barring objections, I'd like to push these soon and then rebase
the main outer-join-vars patch set over them.

regards, tom lane

Attachment Content-Type Size
0001-rearrange-joinrel-PHV-handling.patch text/x-diff 5.9 KB
0002-speed-up-PHI-lookups.patch text/x-diff 16.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message i.lazarev 2022-08-16 19:36:27 Re: MultiXact\SLRU buffers configuration
Previous Message Robert Haas 2022-08-16 19:04:57 Re: XLogBeginRead's header comment lies