Re: Question about pull_up_sublinks_qual_recurse

From: Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Question about pull_up_sublinks_qual_recurse
Date: 2022-10-17 13:43:58
Message-ID: CAKU4AWp2TtVh+CnUFTMFQ-zHvttouXGfcvJovaKH4Q6dwjmPKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Tom:

Thanks for your reply! I have self reviewed the below message at 3 different
time periods to prevent from too inaccurate replies. It may be more detailed
than it really needed, but it probably can show where I am lost.

On Sat, Oct 15, 2022 at 3:27 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

>
> If the pulled-up join doesn't go into the nullable side of the upper
> join then you've changed semantics. In this case, it'd amount to

reassociating a semijoin that was within the righthand side of another
> semijoin to above that other semijoin.

I understand your reply as:

select * from t1 *left join* t2 on exists (select 1 from t3 where t3.a =
t2.a);
= select * from t1 *left join* (t2 semi join t3 on t3.a = t2.a) on true;
-- go to nullable side
!= select * from (t1 *left join* t2 on true) semi join t3 on (t3.a =
t2.a); -- go to above the JoinExpr

I CAN follow the above. And for this case it is controlled by below code:

pull_up_sublinks_qual_recurse

switch (j->jointype)
{
case JOIN_INNER:
...
case JOIN_LEFT:
j->quals = pull_up_sublinks_qual_recurse(root, j->quals,

&j->rarg,

rightrelids,

NULL, NULL);
break;
...
}

and I didn't change this. My question is could we assume

A *semijoin* B ON EXISTS (SELECT 1 FROM C on (Pbc))
= (A *semijoin* (B *semijoin* C on (Pbc))) on TRUE. (current master did)
= (A *semijoin* B ON true) *semijoin* C on (Pbc) (my current
thinking)

Note that there is no 'left outer join' at this place. Since there are too
many places called pull_up_sublinks_qual_recurse, to make things
less confused, I prepared a patch for this one line change to show where
exactly I changed (see patch 2); I think this is the first place I lost.

The discussion of outer join
> reordering in optimizer/README says that that doesn't work,

I think you are talking about the graph "Valid OUTER JOIN Optimizations".
I can follow until below.

"
SEMI joins work a little bit differently. A semijoin can be reassociated
into or out of the lefthand side of another semijoin, left join, or
antijoin, but not into or out of the righthand side. ..
"
I am unclear why
(A semijoin B on (Pab)) semijoin C on (Pbc)
!= A semijoin (B semijoin C on (Pbc)) on (Pab);

Seems both return rows from A which match both semijoin (Pab) and
(Pbc). or I misunderstand the above words in the first place?

At last, when I checked optimizer/README, it looks like we used
a 'nullable side' while it should be 'nonnullable side'? see patch 1
for details.

--
Best Regards
Andy Fan

Attachment Content-Type Size
v1-0001-a-typo-error-I-think-it-should-be-nonnullable-sid.patch application/x-patch 938 bytes
v1-0002-This-patch-doesn-t-mean-I-insist-on-this-at-all-i.patch application/x-patch 1.7 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Matthias van de Meent 2022-10-17 14:00:07 Re: PATCH: Using BRIN indexes for sorted output
Previous Message Robert Haas 2022-10-17 13:15:35 Re: thinko in basic_archive.c