Re: Making Vars outer-join aware

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>
Subject: Re: Making Vars outer-join aware
Date: 2022-08-15 08:48:23
Message-ID: CAMbWs480igxj6hTbvD0WEnzT4XJ8hJXg71EVJO_BcQqN90MQuA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 2, 2022 at 3:51 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> Here's a rebase up to HEAD, mostly to placate the cfbot.
> I accounted for d8e34fa7a (s/all_baserels/all_query_rels/
> in those places) and made one tiny bug-fix change.
> Nothing substantive as yet.

When we add required PlaceHolderVars to a join rel's targetlist, if the
PHV can be computed in the nullable-side of the join, we would add the
join's RT index to phnullingrels. This is correct as we know the PHV's
value can be nulled at this join. But I'm wondering if it is necessary
since we have already propagated any varnullingrels into the PHV when we
apply pullup variable replacement in perform_pullup_replace_vars().

On the other hand, the phnullingrels may contain RT indexes of outer
joins above this join level. It seems not good to add such a PHV to the
joinrel's targetlist. Below is an example:

# explain (verbose, costs off) select a.i, ss.jj from a left join (b left
join (select c.i, coalesce(c.j, 1) as jj from c) ss on b.i = ss.i) on true;
QUERY PLAN
---------------------------------------------------------
Nested Loop Left Join
Output: a.i, (COALESCE(c.j, 1))
-> Seq Scan on public.a
Output: a.i, a.j
-> Materialize
Output: (COALESCE(c.j, 1))
-> Hash Left Join
Output: (COALESCE(c.j, 1))
Hash Cond: (b.i = c.i)
-> Seq Scan on public.b
Output: b.i, b.j
-> Hash
Output: c.i, (COALESCE(c.j, 1))
-> Seq Scan on public.c
Output: c.i, COALESCE(c.j, 1)
(15 rows)

In this query, for the joinrel {B, C}, the PHV in its targetlist has a
phnullingrels that contains the join of {A} and {BC}, which is confusing
because we have not reached that join level.

I tried the changes below to illustrate the two issues. The assertion
holds true during regression tests and the error pops up for the query
above.

--- a/src/backend/optimizer/util/placeholder.c
+++ b/src/backend/optimizer/util/placeholder.c
@@ -464,18 +464,20 @@ add_placeholders_to_joinrel(PlannerInfo *root,
RelOptInfo *joinrel,
{
if (sjinfo->jointype == JOIN_FULL &&
sjinfo->ojrelid != 0)
{
- /* PHV's value can be nulled at this
join */
- phv->phnullingrels =
bms_add_member(phv->phnullingrels,
-
sjinfo->ojrelid);
+
Assert(bms_is_member(sjinfo->ojrelid, phv->phnullingrels));
+
+ if
(!bms_is_subset(phv->phnullingrels, joinrel->relids))
+ elog(ERROR, "phnullingrels
is not subset of joinrel's relids");
}
}
else if (bms_is_subset(phinfo->ph_eval_at,
inner_rel->relids))
{
if (sjinfo->jointype != JOIN_INNER &&
sjinfo->ojrelid != 0)
{
- /* PHV's value can be nulled at this
join */
- phv->phnullingrels =
bms_add_member(phv->phnullingrels,
-
sjinfo->ojrelid);
+
Assert(bms_is_member(sjinfo->ojrelid, phv->phnullingrels));
+
+ if
(!bms_is_subset(phv->phnullingrels, joinrel->relids))
+ elog(ERROR, "phnullingrels
is not subset of joinrel's relids");
}
}

If the two issues are indeed something we need to fix, maybe we can
change add_placeholders_to_joinrel() to search the PHVs from
outer_rel/inner_rel's targetlist, and add the ojrelid to phnullingrels
if needed, just like what we do in build_joinrel_tlist(). The PHVs there
should have the correct phnullingrels (at least the PHVs in base rels'
targetlists have correctly set phnullingrels to NULL).

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2022-08-15 10:48:22 Re: Cleaning up historical portability baggage
Previous Message Peter Eisentraut 2022-08-15 08:41:42 Remove remaining mentions of UNSAFE_STAT_OK