Re: Support "Right Semi Join" plan shapes

From: wenhui qiu <qiuwenhuifx(at)gmail(dot)com>
To: guofenglinux(at)gmail(dot)com
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Support "Right Semi Join" plan shapes
Date: 2023-12-28 03:02:45
Message-ID: CAGjGUA+tO2sHMSTy57LvdwDTiiYn6fif2DB5KidD-92iLAgX+w@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Richard Guo
I did a simple test ,Subqueries of type (in) can be supported, There
is a test sql that doesn't support it, and I think that's because it can't
pull up the subqueries.
```
test=# explain (costs off) SELECT t1.* FROM prt1_adv t1 WHERE EXISTS
(SELECT 1 FROM prt2_adv t2 WHERE t1.a = t2.b) AND t1.b = 0 ORDER BY t1.a;
QUERY PLAN
------------------------------------------------------
Sort
Sort Key: t1.a
-> Hash Right Semi Join
Hash Cond: (t2.b = t1.a)
-> Append
-> Seq Scan on prt2_adv_p1 t2_1
-> Seq Scan on prt2_adv_p2 t2_2
-> Seq Scan on prt2_adv_p3 t2_3
-> Hash
-> Append
-> Seq Scan on prt1_adv_p1 t1_1
Filter: (b = 0)
-> Seq Scan on prt1_adv_p2 t1_2
Filter: (b = 0)
-> Seq Scan on prt1_adv_p3 t1_3
Filter: (b = 0)
(16 rows)

test=# explain (costs off) SELECT t1.* FROM prt1_adv t1 WHERE t1.a IN
(SELECT t2.b FROM prt2_adv t2) AND t1.b = 0 ORDER BY t1.a;
QUERY PLAN
------------------------------------------------------
Sort
Sort Key: t1.a
-> Hash Right Semi Join
Hash Cond: (t2.b = t1.a)
-> Append
-> Seq Scan on prt2_adv_p1 t2_1
-> Seq Scan on prt2_adv_p2 t2_2
-> Seq Scan on prt2_adv_p3 t2_3
-> Hash
-> Append
-> Seq Scan on prt1_adv_p1 t1_1
Filter: (b = 0)
-> Seq Scan on prt1_adv_p2 t1_2
Filter: (b = 0)
-> Seq Scan on prt1_adv_p3 t1_3
Filter: (b = 0)
(16 rows)

test=#

test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE EXISTS
(SELECT 1 FROM plt2_adv t2 WHERE t1.a = t2.a AND t1.c = t2.c) AND t1.b < 10
ORDER BY t1.a;
QUERY PLAN
------------------------------------------------------
Sort
Sort Key: t1.a
-> Hash Right Semi Join
Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
-> Append
-> Seq Scan on plt2_adv_p1 t2_1
-> Seq Scan on plt2_adv_p2 t2_2
-> Seq Scan on plt2_adv_p3 t2_3
-> Hash
-> Append
-> Seq Scan on plt1_adv_p1 t1_1
Filter: (b < 10)
-> Seq Scan on plt1_adv_p2 t1_2
Filter: (b < 10)
-> Seq Scan on plt1_adv_p3 t1_3
Filter: (b < 10)
(16 rows)

test=#
test=# explain (costs off) SELECT t1.* FROM plt1_adv t1 WHERE (t1.a, t1.c)
IN (SELECT t2.a, t2.c FROM plt2_adv t2) AND t1.b < 10 ORDER BY t1.a;
QUERY PLAN
------------------------------------------------------
Sort
Sort Key: t1.a
-> Hash Right Semi Join
Hash Cond: ((t2.a = t1.a) AND (t2.c = t1.c))
-> Append
-> Seq Scan on plt2_adv_p1 t2_1
-> Seq Scan on plt2_adv_p2 t2_2
-> Seq Scan on plt2_adv_p3 t2_3
-> Hash
-> Append
-> Seq Scan on plt1_adv_p1 t1_1
Filter: (b < 10)
-> Seq Scan on plt1_adv_p2 t1_2
Filter: (b < 10)
-> Seq Scan on plt1_adv_p3 t1_3
Filter: (b < 10)
(16 rows)

```

```
test=# explain (costs off) select * from int4_tbl i4, tenk1 a
where exists(select * from tenk1 b
where a.twothousand = b.twothousand and a.fivethous <>
b.fivethous)
and i4.f1 = a.tenthous;
QUERY PLAN
-------------------------------------------------
Hash Right Semi Join
Hash Cond: (b.twothousand = a.twothousand)
Join Filter: (a.fivethous <> b.fivethous)
-> Seq Scan on tenk1 b
-> Hash
-> Hash Join
Hash Cond: (a.tenthous = i4.f1)
-> Seq Scan on tenk1 a
-> Hash
-> Seq Scan on int4_tbl i4
(10 rows)

test=# explain (costs off ) SELECT *
FROM int4_tbl i4, tenk1 a
WHERE (a.twothousand, a.fivethous) IN (
SELECT b.twothousand, b.fivethous
FROM tenk1 b
WHERE a.twothousand = b.twothousand and a.fivethous <> b.fivethous
)
AND i4.f1 = a.tenthous;
QUERY PLAN
----------------------------------------------------------------------------------------
Nested Loop
Join Filter: (i4.f1 = a.tenthous)
-> Seq Scan on tenk1 a
Filter: (SubPlan 1)
SubPlan 1
-> Seq Scan on tenk1 b
Filter: ((a.fivethous <> fivethous) AND (a.twothousand =
twothousand))
-> Materialize
-> Seq Scan on int4_tbl i4
(9 rows)
test=# set enable_nestloop =off;
SET
test=# explain (costs off ) SELECT *
FROM int4_tbl i4, tenk1 a
WHERE (a.twothousand, a.fivethous) IN (
SELECT b.twothousand, b.fivethous
FROM tenk1 b
WHERE a.twothousand = b.twothousand and a.fivethous <> b.fivethous
)
AND i4.f1 = a.tenthous;
QUERY PLAN
----------------------------------------------------------------------------------------
Hash Join
Hash Cond: (a.tenthous = i4.f1)
-> Seq Scan on tenk1 a
Filter: (SubPlan 1)
SubPlan 1
-> Seq Scan on tenk1 b
Filter: ((a.fivethous <> fivethous) AND (a.twothousand =
twothousand))
-> Hash
-> Seq Scan on int4_tbl i4
(9 rows)

```

wenhui qiu <qiuwenhuifx(at)gmail(dot)com> 于2023年12月15日周五 14:40写道:

> Hi Richard Guo I see that the test samples are all (exists)
> subqueries ,I think semi join should also support ( in) and ( any)
> subqueries. would you do more test on ( in) and ( any) subqueries?
>
>
> Best whish
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Hayato Kuroda (Fujitsu) 2023-12-28 03:08:19 RE: Random pg_upgrade test failure on drongo
Previous Message Corey Huinker 2023-12-28 02:49:23 Re: Statistics Import and Export