Re: WIP: Join push-down for foreign tables

From: Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: WIP: Join push-down for foreign tables
Date: 2011-12-02 13:01:06
Message-ID: 4ED8CC12.6000700@enterprisedb.com
Lists: pgsql-hackers

On 17.11.2011 17:24, Tom Lane wrote:
> Heikki Linnakangas <heikki(dot)linnakangas(at)enterprisedb(dot)com> writes:
>> When the FDW recognizes it's being asked to join a ForeignJoinPath and a
>> ForeignPath, or two ForeignJoinPaths, it throws away the old SQL it
>> constructed to do the two-way join, and builds a new one to join all
>> three tables.
>
> It should certainly not "throw away" the old SQL --- that path could
> still be chosen.

Right, that was loose phrasing from me.

>> That seems tedious, when there are a lot of tables
>> involved. A FDW like the pgsql_fdw that constructs an SQL query doesn't
>> need to consider pairs of joins. It could just as well build the SQL for
>> the three-way join directly. I think the API needs to reflect that.

Tom, what do you think of this part? I think it would be a lot more
natural API if the planner could directly ask the FDW to construct a
plan for a three (or more)-way join, instead of asking it to join a join
relation into another relation.

The proposed API is this:

+ FdwPlan *
+ PlanForeignJoin (Oid serverid,
+                  PlannerInfo *root,
+                  RelOptInfo *joinrel,
+                  JoinType jointype,
+                  SpecialJoinInfo *sjinfo,
+                  Path *outer_path,
+                  Path *inner_path,
+                  List *restrict_clauses,
+                  List *pathkeys);

The problem I have with this is that the FDW shouldn't need outer_path
and inner_path. All the information it needs is in 'joinrel'. Except for
outer joins, I guess; is there a convenient way to get the join types
involved in a join rel? It's there in SpecialJoinInfo, but if the FDW is
only passed the RelOptInfo representing the three-way join, it's not there.
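
To make the idea concrete, here is a rough, standalone sketch of building the
remote SQL for an N-way join in one step, from nothing but the set of relations
and join clauses. ToyJoinRel, toy_append and toy_deparse_join are hypothetical
names invented for this illustration; they are not planner structures or part
of the proposed patch, and the clauses are assumed to be already deparsed to
text. It handles inner joins only, which is exactly where the missing
join-type information would start to matter.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for what a join rel covers: the base relations
 * and the join clauses. NOT the real RelOptInfo.
 */
typedef struct ToyJoinRel
{
    int         nrels;          /* number of base relations */
    const char *relnames[8];    /* remote table names */
    int         nclauses;       /* number of join clauses */
    const char *clauses[8];     /* clauses already deparsed to SQL text */
} ToyJoinRel;

/* Append s to buf; returns 0 on success, -1 if it would not fit. */
static int
toy_append(char *buf, size_t buflen, size_t *used, const char *s)
{
    size_t      len = strlen(s);

    if (*used + len + 1 > buflen)
        return -1;
    memcpy(buf + *used, s, len + 1);    /* copies the terminating NUL too */
    *used += len;
    return 0;
}

/*
 * Build a single remote query for the whole join, without looking at
 * pairs of sub-joins. Inner joins only: there is no join type
 * information here, so outer joins cannot be expressed.
 * Returns 0 on success, -1 if buf is too small.
 */
static int
toy_deparse_join(const ToyJoinRel *rel, char *buf, size_t buflen)
{
    size_t      used = 0;
    int         i;

    assert(rel != NULL && buf != NULL);

    if (toy_append(buf, buflen, &used, "SELECT * FROM ") < 0)
        return -1;

    /* All relations go into one flat FROM list. */
    for (i = 0; i < rel->nrels; i++)
    {
        if (i > 0 && toy_append(buf, buflen, &used, ", ") < 0)
            return -1;
        if (toy_append(buf, buflen, &used, rel->relnames[i]) < 0)
            return -1;
    }

    /* All join clauses are ANDed together in WHERE. */
    for (i = 0; i < rel->nclauses; i++)
    {
        if (toy_append(buf, buflen, &used, i == 0 ? " WHERE " : " AND ") < 0)
            return -1;
        if (toy_append(buf, buflen, &used, rel->clauses[i]) < 0)
            return -1;
    }
    return 0;
}
```

With three relations a, b, c and the clauses "a.id = b.a_id" and
"b.id = c.b_id", this produces one query over all three tables, with no
intermediate two-way SQL ever being built.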

Does the planner expect the result from the foreign server to be
correctly sorted, if it passes pathkeys to that function?

>> I wonder if we should have a heuristic to not even consider doing a join
>> locally, if it can be done remotely.
>
> I think this is a bad idea. It will require major restructuring of the
> planner, and sometimes it will fail to find the best plan, in return for
> not much. The nature of join planning is that we investigate a lot of
> dead ends.

Ok.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
