Re: Using each rel as both outer and inner for JOIN_ANTI

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Emre Hasegeli <emre(at)hasegeli(dot)com>
Subject: Re: Using each rel as both outer and inner for JOIN_ANTI
Date: 2023-04-10 09:21:08
Message-ID: CAMbWs4_eChX1bN=vj0Uzg_7iz9Uivan+Wjjor-X87L-V27A+rw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 7, 2023 at 3:28 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:

> On Tue, Aug 2, 2022 at 3:13 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:
>
>> On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>
>>> [ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]
>>
>> Maybe this is something we can do. Currently for the query below:
>>
>> # explain select * from foo where a in (select c from bar);
>> QUERY PLAN
>> -------------------------------------------------------------------------
>> Hash Semi Join (cost=154156.00..173691.29 rows=10 width=8)
>> Hash Cond: (foo.a = bar.c)
>> -> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8)
>> -> Hash (cost=72124.00..72124.00 rows=5000000 width=4)
>> -> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=4)
>> (5 rows)
>>
>> I believe we can get a cheaper plan if we are able to swap the outer and
>> inner for SEMI JOIN and use the smaller 'foo' as inner rel.
>>
> It may not be easy for MergeJoin and NestLoop though, as we do not have
> a way to know if an inner tuple has been already matched or not. But
> the benefit of swapping inputs for MergeJoin and NestLoop seems to be
> small, so I think it's OK to ignore them.
>

Hmm. Actually we can do it for MergeJoin by avoiding restoring inner
scan to the marked tuple in EXEC_MJ_TESTOUTER, in the case when new
outer tuple == marked tuple. But I'm not sure how much benefit we can
get from Merge Right Semi Join.

For HashJoin, though, there are cases that can surely benefit from Hash
Right Semi Join. So I go ahead and have a try on it as attached.

Thanks
Richard

Attachment Content-Type Size
v1-0001-Support-Right-Semi-Join.patch application/octet-stream 23.3 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Zhijie Hou (Fujitsu) 2023-04-10 09:46:05 RE: Support logical replication of DDLs
Previous Message Hayato Kuroda (Fujitsu) 2023-04-10 09:18:46 RE: [PoC] pg_upgrade: allow to upgrade publisher node