Re: Join push-down for foreign tables

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Join push-down for foreign tables
Date: 2011-08-30 11:45:48
Message-ID: 4E5CCD6C.1020902@gmail.com
Lists: pgsql-hackers

Thanks for the comments.

(2011/08/30 1:42), Tom Lane wrote:
>> Costs of ForeignJoinPath are estimated by FDW via new routine
>> PlanForeignJoin, and SQL based FDW would need to generate remote SQL
>> here. If a FDW can't push down that join, then it can set disable_cost
>> (1.0e10) to tell planner to not choose that path.
>
> disable_cost is not a positive guarantee that a path won't be chosen.
> Particularly not for foreign table accesses, where the estimated costs
> could be pretty darn large in themselves. You need to pick an API
> wherein refusal is unmistakable. Probably, returning NULL instead of a
> Path structure is the appropriate way to signal "can't do this join".

Agreed. Returning NULL seems fine.
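
To illustrate why a NULL return is safer than disable_cost, here is a minimal standalone sketch. It is not planner code: SketchPath, sketch_plan_foreign_join() and sketch_cheapest() are hypothetical names made up for this example, and the cost figures are invented. It only shows that a path carrying 1.0e10 can still win when the local alternative is costed even higher, while a NULL candidate can never be picked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified stand-in for a planner Path. */
typedef struct SketchPath
{
    const char *label;
    double      total_cost;
} SketchPath;

/*
 * Hypothetical FDW callback in the spirit of PlanForeignJoin: fill in and
 * return a path when the join can be pushed down, or return NULL to refuse.
 */
static SketchPath *
sketch_plan_foreign_join(bool can_push_down, double remote_cost,
                         SketchPath *storage)
{
    assert(storage != NULL);
    if (!can_push_down)
        return NULL;            /* unmistakable refusal */
    storage->label = "ForeignJoin";
    storage->total_cost = remote_cost;
    return storage;
}

/*
 * Choose the cheaper of a local join path and an optional foreign join
 * path.  A NULL foreign candidate is never considered at all.
 */
static const SketchPath *
sketch_cheapest(const SketchPath *local, const SketchPath *foreign)
{
    assert(local != NULL);
    if (foreign == NULL)
        return local;
    return (foreign->total_cost < local->total_cost) ? foreign : local;
}
```

With a local join costed at 5.0e10 (an invented number), a "disabled" foreign path at 1.0e10 is still the cheaper one and gets chosen; with NULL, the local path is always returned.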

>> In this design, cost of ForeignJoinPath is compared to other join nodes
>> such as NestPath and MergePath. If ForeignJoinPath is the cheapest one
>> among the join candidates, planner will generate ForeignJoin plan node
>> and put it into plan tree as a leaf node. In other words, joined
>> foreign tables are merged into upper ForeignJoin node.
>
> Hmmm ... are you trying to describe what happens when three or more
> foreign tables are all to be joined at the remote end?

Yes, that's what I wanted to say :)

> I agree that's
> an important use-case, and that we probably want just one Plan node to
> result from it, but I'm less sure about what the Path representation
> ought to be. It might be better to retain the Path tree showing what
> we'd concluded about what the join order ought to be, with the idea that
> the transmitted query could be constructed to reflect that, saving the
> remote-end planner from having to repeat that work.

That seems like a fine solution. Somehow I had assumed that each path
node should map to exactly one plan node. In fact, a merge join path
node may already be expanded into multiple plan nodes, though that is
the reverse of the foreign join case. I'm going to implement this idea,
and hopefully post a proof-of-concept patch for the next CF.
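
As a rough illustration of keeping the Path tree and building the transmitted query from it, the sketch below walks a binary join tree and emits a parenthesized FROM-clause fragment that mirrors the join order. Everything here is an assumption made for the example: SketchJoinPath, sketch_append() and sketch_deparse_from() are hypothetical names, join conditions are left out (CROSS JOIN is used as a placeholder), and a real FDW would deparse from the actual planner structures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical join tree: a leaf has relname set, a join has outer/inner. */
typedef struct SketchJoinPath
{
    const char *relname;
    const struct SketchJoinPath *outer;
    const struct SketchJoinPath *inner;
} SketchJoinPath;

/* Append s to the NUL-terminated string in buf, truncating if it is full. */
static void
sketch_append(char *buf, size_t buflen, const char *s)
{
    size_t      used = strlen(buf);

    if (used + 1 < buflen)
        strncat(buf, s, buflen - used - 1);
}

/*
 * Recursively deparse the tree into buf (which must already hold a valid,
 * possibly empty, string).  Parentheses preserve the join order that the
 * local planner decided on.
 */
static void
sketch_deparse_from(const SketchJoinPath *path, char *buf, size_t buflen)
{
    assert(path != NULL);
    if (path->relname != NULL)
    {
        sketch_append(buf, buflen, path->relname);
        return;
    }
    sketch_append(buf, buflen, "(");
    sketch_deparse_from(path->outer, buf, buflen);
    sketch_append(buf, buflen, " CROSS JOIN ");     /* placeholder join type */
    sketch_deparse_from(path->inner, buf, buflen);
    sketch_append(buf, buflen, ")");
}
```

For a tree that joins a and b first and then joins the result with c, this produces "((a CROSS JOIN b) CROSS JOIN c)", so a single plan node could carry one remote query for the whole subtree.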

BTW, is adding a foreign server OID to RelOptInfo acceptable? This
field would be set in build_simple_rel() or build_join_rel() if the
RelOptInfo itself is a foreign scan, or if it is a foreign join and both
the inner and outer RelOptInfos have the same valid foreign server OID.
I think this field would avoid a recursive search into the foreign join
subtree.
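
The rule I have in mind could be sketched like this. It is a standalone illustration, not a patch: SketchRel, its serverid field and sketch_join_serverid() are hypothetical names, and Oid and InvalidOid are redefined locally so the snippet compiles on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins so the sketch is self-contained. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Hypothetical, minimal stand-in for RelOptInfo with the proposed field. */
typedef struct SketchRel
{
    Oid         serverid;       /* InvalidOid unless wholly on one server */
} SketchRel;

/*
 * Server OID for a join rel: inherited only when both inputs carry the
 * same valid server OID, otherwise InvalidOid.
 */
static Oid
sketch_join_serverid(const SketchRel *outer, const SketchRel *inner)
{
    assert(outer != NULL && inner != NULL);
    if (outer->serverid != InvalidOid &&
        outer->serverid == inner->serverid)
        return outer->serverid;
    return InvalidOid;
}
```

Because each join rel gets its value from its two inputs when it is built, checking whether a whole subtree lives on one server is a single field comparison rather than a recursive walk.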

Regards,
--
Shigeru Hanada
