Re: Making Vars outer-join aware

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, "Finnerty, Jim" <jfinnert(at)amazon(dot)com>
Subject: Re: Making Vars outer-join aware
Date: 2022-12-28 08:49:23
Message-ID: CAMbWs4_JDDdXvEBWtTao9ZTwOGdDCU7X-=bF7yi9G1u_-b0Dfw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 27, 2022 at 11:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> The thing that I couldn't get around before is that if you have,
> say, a mergejoinable equality clause in an outer join:
>
> select ... from a left join b on a.x = b.y;
>
> that equality clause can only be associated with the join domain
> for B, because it certainly can't be enforced against A. However,
> you'd still wish to be able to do a mergejoin using indexes on
> a.x and b.y, and this means that we have to understand the ordering
> induced by a PathKey based on this EC as applicable to A, even
> though that relation is not in the same join domain. So there are
> situations where sort orderings apply across domain boundaries even
> though equalities don't. We might have to split the notion of
> EquivalenceClass into two sorts of objects, and somewhere right
> about here is where I realized that this wasn't getting finished
> for v16 :-(.

I think I see where the problem is. And I can see currently in
get_eclass_for_sort_expr we always use the top JoinDomain. So although
the equality clause 'a.x = b.y' belongs to JoinDomain {B}, we set up ECs
for 'a.x' and 'b.y' that belong to the top JoinDomain {A, B, A/B}.

But doing so would lead to a situation where the "same" Vars from
different join domains might have the same varnullingrels and thus would
match by equal(). As an example, consider

select ... from a left join b on a.x = b.y where a.x = 1;

As said we would set up EC for 'b.y' as belonging to the top JoinDomain.
Then when reconsider_outer_join_clause generates the equality clause
'b.y = 1', we figure out that the new clause belongs to JoinDomain {B}.
Note that the two 'b.y' here belong to different join domains but they
have the same varnullingrels (empty varnullingrels actually). As a
result, the equality 'b.y = 1' would be merged into the existing EC for
'b.y', because the two 'b.y' matches by equal() and we do not check
JoinDomain for non-const EC members. So we would end up with an EC
containing EC members of different join domains.

And it seems this would make the following statement in README not hold
any more.

We don't have to worry about this for Vars (or expressions
containing Vars), because references to the "same" column from
different join domains will have different varnullingrels and thus
won't be equal() anyway.

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jelte Fennema 2022-12-28 09:11:05 Re: [PATCH] Support using "all" for the db user in pg_ident.conf
Previous Message Amit Langote 2022-12-28 07:31:23 Re: SQL/JSON revisited