RE: Add semi-join pushdown to postgres_fdw

From: "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp>
To: Alexander Pyhalov <a(dot)pyhalov(at)postgrespro(dot)ru>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Pgsql Hackers <pgsql-hackers(at)postgresql(dot)org>, Ian Lawrence Barwick <barwick(at)gmail(dot)com>, "Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp" <Fujii(dot)Yuki(at)df(dot)MitsubishiElectric(dot)co(dot)jp>
Subject: RE: Add semi-join pushdown to postgres_fdw
Date: 2022-12-03 03:02:02
Message-ID: OS3PR01MB6660A7F36A2D37194DB601F195169@OS3PR01MB6660.jpnprd01.prod.outlook.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Mr.Pyhalov.

Thank you for work on this useful patch.
I'm starting to review v2 patch.
I have cheked we can apply v2 patch to commit ec386948948c1708c0c28c48ef08b9c4dd9d47cc
(Date:Thu Dec 1 12:56:21 2022 +0100).
I briefly looked at this whole thing and did step execute this
by running simple queries such as the followings.

query1) select * from f_t1 a1 where a1.c1 in (select c1 from f_t2);
query2) select * from f_t1 a1 join f_t3 a2 on a1.c1 = a2.c1 where a1.c1 in (select c1 from f_t3) ;
query3) update f_t2 set c1 = 1 from f_t1 a1 where a1.c2 = f_t2.c2 and exists (select null from f_t2 where c1 = a1.c1);

Although I haven't seen all of v2 patch, for now I have the following questions.

question1)
> + if (jointype == JOIN_SEMI && bms_is_member(var->varno, innerrel->relids) && !bms_is_member(var->varno, outerrel->relids))
It takes time for me to find in what case this condition is true.
There is cases in which this condition is true for semi-join of two baserels
when running query which joins more than two relations such as query2 and query3.
Running queries such as query2, you maybe want to pushdown of only semi-join path of
joinrel(outerrel) defined by (f_t1 a1 join f_t3 a2 on a1.c1 = a2.c1) and baserel(innerrel) f_t3
because of safety deparse. So you add this condition.
Becouase of this limitation, your patch can't push down subquery expression
"exists (select null from f_t2 where c1 = a1.c1)" in query3.
I think, it is one of difficulty points for semi-join pushdown.
This is my understanding of the intent of this condition and the restrictions imposed by this condition.
Is my understanding right?
I think if there are comments for the intent of this condition and the restrictions imposed by this condition
then they help PostgreSQL developper. What do you think?

question2) In foreign_join_ok
> * Constructing queries representing ANTI joins is hard, hence
Is this true? Is it hard to expand your approach to ANTI join pushdown?

question3) You use variables whose name is "addl_condXXX" in the following code.
> appendStringInfo(addl_conds, "EXISTS (SELECT NULL FROM %s", join_sql_i.data);
Does this naming mean additional literal?
Is there more complehensive naming, such as "subquery_exprXXX"?

question4) Although really detail, there is expression making space such as
"ft4.c2 = ft2.c2" and one making no space such as "c1=ftupper.c1".
Is there reason for this difference? If not, need we use same policy for making space?

Later, I'm going to look at part of your patch which is used when running more complex query.

Sincerely yours,
Yuuki Fujii

--
Yuuki Fujii
Information Technology R&D Center Mitsubishi Electric Corporation

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2022-12-03 03:06:16 Re: Failed Assert in pgstat_assoc_relation
Previous Message Michael Paquier 2022-12-03 02:45:30 Re: Add LZ4 compression in pg_dump