Re: Join push-down support for foreign tables

From: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Join push-down support for foreign tables
Date: 2014-12-16 00:00:53
Message-ID: 9A28C8860F777E439AA12E8AEA7694F80109199D@BPXM15GP.gisp.nec.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hanada-san,

Thanks for proposing this great functionality people waited for.

> On Mon, Dec 15, 2014 at 3:40 AM, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
> wrote:
> > I'm working on $SUBJECT and would like to get comments about the
> > design. Attached patch is for the design below.
>
> I'm glad you are working on this.
>
> > 1. Join source relations
> > As described above, postgres_fdw (and most of SQL-based FDWs) needs to
> > check that 1) all foreign tables in the join belong to a server, and
> > 2) all foreign tables have same checkAsUser.
> > In addition to that, I add extra limitation that both inner/outer
> > should be plain foreign tables, not a result of foreign join. This
> > limiation makes SQL generator simple. Fundamentally it's possible to
> > join even join relations, so N-way join is listed as enhancement item
> > below.
>
> It seems pretty important to me that we have a way to push the entire join
> nest down. Being able to push down a 2-way join but not more seems like
> quite a severe limitation.
>
As long as we don't care about simpleness/gracefulness of the remote
query, what we need to do is not complicated. All the optimization jobs
are responsibility of remote system.

Let me explain my thought:
We have three cases to be considered; (1) a join between foreign tables
that is the supported case, (2) a join either of relations is foreign
join, and (3) a join both of relations are foreign joins.

In case of (1), remote query shall have the following form:
SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>

In case of (2) or (3), because we already construct inner/outer join,
it is not difficult to replace the FT1 or FT2 above by sub-query, like:
SELECT <tlist> FROM FT3 JOIN
(SELECT <tlist> FROM FT1 JOIN FT2 ON <cond> WHERE <qual>) as FJ1
ON <joincond> WHERE <qual>

How about your thought?

Let me comment on some other points at this moment:

* Enhancement in src/include/foreign/fdwapi.h

It seems to me GetForeignJoinPath_function and GetForeignJoinPlan_function
are not used anywhere. Is it an oversight to remove definitions from your
previous work, isn't it?
Now ForeignJoinPath is added on set_join_pathlist_hook, but not callback of
FdwRoutine.

* Is ForeignJoinPath really needed?

I guess the reason why you added ForeignJoinPath is, to have the fields
of inner_path/outer_path. If we want to have paths of underlying relations,
a straightforward way for the concept (join relations is replaced by
foreign-/custom-scan on a result set of remote join) is enhancement of the
fields in ForeignPath.
How about an idea that adds "List *fdw_subpaths" to save the list of
underlying Path nodes. It also allows to have more than two relations
to be joined.
(Probably, it should be a feature of interface portion. I may have to
enhance my portion.)

* Why NestPath is re-defined?

-typedef JoinPath NestPath;
+typedef struct NestPath
+{
+ JoinPath jpath;
+} NestPath;

It looks to me this change makes patch scale larger...

Best regards,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei <kaigai(at)ak(dot)jp(dot)nec(dot)com>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Josh Berkus 2014-12-16 00:14:30 Re: Commitfest problems
Previous Message Amit Langote 2014-12-15 23:55:35 Re: On partitioning