Re: Using each rel as both outer and inner for JOIN_ANTI

From: Richard Guo <guofenglinux(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Ronan Dunklau <ronan(dot)dunklau(at)aiven(dot)io>, Pg Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Emre Hasegeli <emre(at)hasegeli(dot)com>
Subject: Re: Using each rel as both outer and inner for JOIN_ANTI
Date: 2023-04-07 07:28:46
Message-ID: CAMbWs4_p=hMF=yJRLWxUT6H-4igLWSN0onSnxzgqKHZBqt_1AA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Aug 2, 2022 at 3:13 PM Richard Guo <guofenglinux(at)gmail(dot)com> wrote:

> On Sun, Jul 31, 2022 at 12:07 AM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>
>> [ wanders away wondering if JOIN_RIGHT_SEMI should become a thing ... ]
>
> Maybe this is something we can do. Currently for the query below:
>
> # explain select * from foo where a in (select c from bar);
> QUERY PLAN
> -------------------------------------------------------------------------
> Hash Semi Join (cost=154156.00..173691.29 rows=10 width=8)
> Hash Cond: (foo.a = bar.c)
> -> Seq Scan on foo (cost=0.00..1.10 rows=10 width=8)
> -> Hash (cost=72124.00..72124.00 rows=5000000 width=4)
> -> Seq Scan on bar (cost=0.00..72124.00 rows=5000000 width=4)
> (5 rows)
>
> I believe we can get a cheaper plan if we are able to swap the outer and
> inner for SEMI JOIN and use the smaller 'foo' as inner rel.
>

I'm thinking about the JOIN_RIGHT_SEMI thing and it seems that it can be
implemented for HashJoin with very short change. What we want to do is
to just have the first match for each inner tuple. So after scanning
the hash bucket for matches, we just need to check whether the inner
tuple has been set match and skip it if so, something like

{
if (!ExecScanHashBucket(node, econtext))
{
/* out of matches; check for possible outer-join fill */
node->hj_JoinState = HJ_FILL_OUTER_TUPLE;
continue;
}
}

+ /*
+ * In a right-semijoin, we only need the first match for each
+ * inner tuple.
+ */
+ if (node->js.jointype == JOIN_RIGHT_SEMI &&
+ HeapTupleHeaderHasMatch(HJTUPLE_MINTUPLE(node->hj_CurTuple)))
+ continue;
+

I have a simple implementation locally and tried it with the query below
and saw a speedup of 2055.617 ms VS. 1156.772 ms (both best of 3).

# explain (costs off, analyze)
select * from foo where a in (select c from bar);
QUERY PLAN
-------------------------------------------------------------------------------
Hash Semi Join (actual time=1957.748..2055.058 rows=10 loops=1)
Hash Cond: (foo.a = bar.c)
-> Seq Scan on foo (actual time=0.026..0.029 rows=10 loops=1)
-> Hash (actual time=1938.818..1938.819 rows=5000000 loops=1)
Buckets: 262144 Batches: 64 Memory Usage: 4802kB
-> Seq Scan on bar (actual time=0.016..853.010 rows=5000000
loops=1)
Planning Time: 0.327 ms
Execution Time: 2055.617 ms
(8 rows)

# explain (costs off, analyze)
select * from foo where a in (select c from bar);
QUERY PLAN
-------------------------------------------------------------------------
Hash Right Semi Join (actual time=11.525..1156.713 rows=10 loops=1)
Hash Cond: (bar.c = foo.a)
-> Seq Scan on bar (actual time=0.034..523.036 rows=5000000 loops=1)
-> Hash (actual time=0.027..0.029 rows=10 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 9kB
-> Seq Scan on foo (actual time=0.009..0.014 rows=10 loops=1)
Planning Time: 0.312 ms
Execution Time: 1156.772 ms
(8 rows)

It may not be easy for MergeJoin and NestLoop though, as we do not have
a way to know if an inner tuple has been already matched or not. But
the benefit of swapping inputs for MergeJoin and NestLoop seems to be
small, so I think it's OK to ignore them.

So is it worthwhile to make JOIN_RIGHT_SEMI come true?

Thanks
Richard

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2023-04-07 07:41:27 Re: CREATE SUBSCRIPTION -- add missing tab-completes
Previous Message Masahiko Sawada 2023-04-07 06:52:49 Re: Should vacuum process config file reload more often