Re: Push down more full joins in postgres_fdw

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Push down more full joins in postgres_fdw
Date: 2016-11-28 10:22:03
Message-ID: c6a68ef1-3e97-807c-b381-087d109739eb@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2016/11/24 21:45, Etsuro Fujita wrote:
> On 2016/11/22 18:28, Ashutosh Bapat wrote:
>> The comments should explain why is the assertion true.
>> + /* Shouldn't be NIL */
>> + Assert(tlist != NIL);

> I noticed that I was wrong; in the Assertion the tlist can be empty. An
> example for such a case is:
>
> SELECT 1 FROM (SELECT c1 FROM ft4 WHERE c1 between 50 and 60) t1 FULL
> JOIN (SELECT c1 FROM ft5 WHERE c1 between 50 and 60) t2 ON (TRUE);
>
> In this example any columns of the relations ft4 and ft5 wouldn't be
> needed for the join or the final output, so both the tlists for the
> relations created in deparseRangeTblRef would be empty. Attached is an
> updated version fixing this bug. Changes are:
>
> * I removed make_outerrel_subquery/make_innerrel_subquery from fpinfo
> and added a new member "is_subquery_rel" to fpinfo, as a flag on whether
> to deparse the relation as a subquery.
> * I modified deparseRangeTblRef, deparseSelectSql, and is_subquery_var
> to see the is_subquery_rel, not
> make_outerrel_subquery/make_innerrel_subquery.
> * I modified appendSubselectAlias to handle the case where ncols = 0
> properly.
> * I renamed subquery_rels in fpinfo to lower_subquery_rels, to make it
> easy to distinguish this from is_subquery_rel clearly.
> * I added regression tests for that.

I noticed that the above changes are not adequate; the updated patch
breaks EXPLAIN outputs for EvalPlanQual subplans of foreign joins in
which foreign tables are full joined and in the remote join query the
foreign tables are deparsed as subqueries. Consider:

postgres=# explain verbose select * from test, ((select * from ft1 where
b between 1 and 3) t1 full join (select * from ft2 where b between 1 and
3) t2
on (t1.a = t2.a)) ss for update of test;


QUERY PLA
N
-------------------------------------------------------------------------------------------------------------------------------------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
LockRows (cost=100.00..287.33 rows=6780 width=94)
Output: test.a, test.b, ft1.a, ft1.b, ft2.a, ft2.b, test.ctid,
ft1.*, ft2.*
-> Nested Loop (cost=100.00..219.53 rows=6780 width=94)
Output: test.a, test.b, ft1.a, ft1.b, ft2.a, ft2.b, test.ctid,
ft1.*, ft2.*
-> Seq Scan on public.test (cost=0.00..32.60 rows=2260 width=14)
Output: test.a, test.b, test.ctid
-> Materialize (cost=100.00..102.19 rows=3 width=80)
Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.*
-> Foreign Scan (cost=100.00..102.17 rows=3 width=80)
Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.*
Relations: (public.ft1) FULL JOIN (public.ft2)
Remote SQL: SELECT s5.c1, s5.c2, s5.c3, s6.c1,
s6.c2, s6.c3 FROM ((SELECT a, b, ROW(a, b) FROM public.t1 WHERE ((b >=
1)) AND ((b
<= 3))) s5(c1, c2, c3) FULL JOIN (SELECT a, b, ROW(a, b) FROM public.t2
WHERE ((b >= 1)) AND ((b <= 3))) s6(c1, c2, c3) ON (((s5.c1 = s6.c1))))
-> Hash Full Join (cost=201.14..202.29 rows=3
width=80)
Output: ft1.a, ft1.b, ft1.*, ft2.a, ft2.b, ft2.*
Hash Cond: (ft1.a = ft2.a)
-> Foreign Scan on public.ft1
(cost=100.00..101.11 rows=3 width=40)
Output: ft1.a, ft1.b, ft1.*
Remote SQL: SELECT NULL FROM public.t1
WHERE ((b >= 1)) AND ((b <= 3))
-> Hash (cost=101.11..101.11 rows=3 width=40)
Output: ft2.a, ft2.b, ft2.*
-> Foreign Scan on public.ft2
(cost=100.00..101.11 rows=3 width=40)
Output: ft2.a, ft2.b, ft2.*
Remote SQL: SELECT NULL FROM
public.t2 WHERE ((b >= 1)) AND ((b <= 3))
(23 rows)

The SELECT clauses of the remote queries for the Foreign Scans in the
EPQ subplan are deparsed incorrectly into "SELECT NULL". The reason for
that is that since the foreign tables' fpinfo->is_subquery_rel has been
set true at the time those clauses are deparsed (due to that the foreign
tables have been deparsed as subqueries in the main remote join query),
deparseSelectSql calls deparseExplicitTargetList with an empty tlist, to
construct the clauses, so it deparses the tlist into "NULL". This is
harmless because we don't use the remote queries for the Foreign Scans
during an EPQ recheck, but that would be confusing for users, so I
modified the patch a bit further to make the EXPLAIN output as-is.
Attached is a new version of the patch.

> I also revised code and comments a bit, just for consistency.

Also, I renamed appendSubselectAlias to appendSubqueryAlias, because we
use the term "subquery" for other places.

> I will update another patch for PHVs on top of the attached one.

Done. I also revised some code and comments. I'm also attaching the
updated version of the patch for PHVs.

Best regards,
Etsuro Fujita

Attachment Content-Type Size
postgres-fdw-subquery-support-v12.patch text/x-patch 41.6 KB
postgres-fdw-phv-pushdown-v12.patch text/x-patch 73.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2016-11-28 10:22:47 Re: asynchronous execution
Previous Message Amit Langote 2016-11-28 10:18:20 Fix comment in build_simple_rel