WIP: Join push-down for foreign tables

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: WIP: Join push-down for foreign tables
Date: 2011-09-14 09:24:25
Message-ID: 4E7072C9.10508@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi all,

I'd like to propose $SUBJECT for further foreign query optimization.
I've not finished development, but I'd appreciate it if I got someone's
review on my WIP code and its design.

Changes I made
==============

(1) Add foreign server OID to RelOptInfo
I think it would be nice to know whether a join comes from one foreign
server or not without digging into child nodes during considering paths
for a query. So I added serverid field to RelOptInfo, which defaults to
InvalidOid ,and is set to OID of the server if the node and all of its
children are from same foreign server. This also avoids looking catalog
up for foreign table entry to determine FDW routine.

(2) Add new planner node, ForeignJoinPath
ForeignJoinPath derives JoinPath, like other Join nodes, and holds
FdwPlan like ForeignPath node.

This node is used to represent a pushed-down join between foreign tables
and/or another foreign join in early planning phase, for all of
combination such as table-table, table-join, join-table and join-join
will be considered. In addition, though they might generate same
internal (FDW-specific) plan, reversed combination is considered;
planner generates two ForeignJoinPath for both (A & B) and (B & A).

During creation of this node, planner calls new FDW handler function
PlanForeignJoin to get a FdwPlan which includes costs and internal plan
of a foreign join. If a FDW can't (or doesn't want to) handle this
join, just return NULL is OK, and then planner gives such optimization
up and considers other usual join methods such as nested loop and hash join.

A subtree which has a ForeignJoin on its top is translated into a
ForeignScan node during constructing a plan tree. This behavior is
different from other join path nodes such as NestPath and MergePath,
because they have child plan nodes correspond to path nodes.

(3) Add EXPALIN support for foreign join (currently just for debug)
ForeignScan might not be a simple foreign table scan, so
ExplainScanTarget() can't be used for it. An idea I have is adding
ExplainForeignScanTarget() to handle ForeignScan separately from other
scan nodes.

(4) Add new GUC parameter, enable_foreignjoin
If this was off, planner never generates ForeignJoinPath. In such case,
foreign tables will be joined with one of NestLoop, MergeJoin and HashJoin.

Known issue
===========

I'm sorry but attached patch, join_pushdown_v1.patch, is WIP, so
currently some kind of query fails. Known failure patterns are:

*) SELECT * FROM A JOIN B (...) doesn't work. Specifying columns in
SELECT clause explicitly like "SELECT A.col1, A.col2, ..." seems to work.
*) ORDER BY causes error if no column is specified in SELECT clause from
sort key's table.

Probably more problems still are there...

PG-wrapper as sample implementation
===================================

pgsql_fdw-0.1.0.tar.gz is an WIP implementation of PG-wrapper, which can
(hopefully) handle both simple foreign table scan and multiple foreign
joins. You can build it with placing in contrib/, or using pgxs. Note
that it has some issues such as memory leak of PGresult. I'm planning
to propose this wrapper as a contrib module, but it would be after
clearing such issues.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
join_pushdown_v1.patch text/plain 32.1 KB
pgsql_fdw-0.1.0.tar.gz application/gzip 77.5 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Eisentraut 2011-09-14 09:33:15 Re: unite recovery.conf and postgresql.conf
Previous Message Kyotaro HORIGUCHI 2011-09-14 09:21:08 [REVIEW] pg_last_xact_insert_timestamp