Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Thom Brown <thom(at)linux(dot)com>, "pgsql-hackers(at)postgreSQL(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Custom/Foreign-Join-APIs (Re: [v9.5] Custom Plan API)
Date: 2015-04-03 10:32:08
Message-ID: CAEZqfEci52HP1BLZBsc=UwPUQvg357CsJhWrTm+N+VGW6kiN3g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached is the patch which adds join push-down support to postgres_fdw
(v7). It supports SELECT statements with JOIN, but has some more possible
enhancements (see below). I'd like to share the WIP patch here to get
comments about new FDW API design provided by KaiGai-san's v11 patch.

To make reviewing easier, I summarized changes against Custom/Foreign join
v11 patch.

Changes for Join push-down support
==================================
- Add FDW API GetForeignJoinPaths(). It generates a ForeignPath which
represents a scan against pseudo join relation represented by given
RelOptInfo.
- Expand deparsing module to handle multi-relation queries. Steps of
deparsing a join query:

1) Optimizer calls postgresGetForeignPaths() for each BASEREL. Here
postgres_fdw does the same things as before, except adding column aliases
in SELECT clause.
2) Optimizer calls postgresGetForeignJoinPaths() for each JOINREL.
Optimizer calls once per RelOptInfo with reloptkind == RELOPT_JOINREL, so
postgres_fdw should consider both A JOIN B and B JOIN A in one call.

postgres_fdw checks whether the join can be pushed down.

a) Both outer and inner relations can be pushed down (NULL in
RelOptInfo#fdw_private indicates such situation)
b) Outmost command is a SELECT (this can be relaxed in the future)
c) Join type is inner or one of outer
d) Server of all relations in the join are identical
e) Effective user id for all relations in the join are identical (they
might be different some were accessed via views)
f) No local filters (this can be relaxed if inner && non-volatile)
g) Join conditions doesn't contain any "unsafe" expression
h) Remote filter doesn't contain any "unsafe" expression

If all criteria passed, postgres_fdw makes ForeignPath for the join and
store these information in its fdw_private.

a) ForeignPath of outer relation, first non-parameterized one
b) ForeignPath of outer relation, first non-parameterized one
c) join type (as integer)
d) join conditions (as List of Expr)
e) other conditions (as List of Expr)

As of now the costs of the path is not so accurate, this is a possible
enhancement.

2) Optimizer calls postgresGetForeignPlan() for the cheapest topmost Path.
If foreign join is the cheapest way to execute the query, optimizer calls
postgresGetForeignPlan for the topmost path generated by
postgresGetForeignJoinPaths. As Robert and Tom mentioned in the thread,
large_table JOIN huge_table might be removed even (large_table JOIN
huge_table) JOIN small_table is the cheapest in the join level 3, so
postgres_fdw can't assume that paths in lower level survived planning.

To cope with the situation, I'm trying to avoid calling create_plan_recurse()
for underlying paths by putting necessary information into
PgFdwRelationInfo and link it to appropriate RelOptInfo.

Anyway in current version query string is built up from bottom (BASEREL) to
upper recursively. For a join, unerlying outer/inner query are put into
FROM clause with wrapping with parenthesis and aliasing. For example:

select * from pgbench_branches b join pgbench_tellers t on t.bid = b.bid;

is transformed to a query like this:

SELECT l.a1, l.a2, l.a3, r.a1, r.a2, r.a3, r.a4 FROM (SELECT bid a9,
bbalance a10, filler a11 FROM public.pgbench_branches) l (a1, a2, a3) INNER
JOIN (SELECT tid a9, bid a10, balance a11, filler a12 FROM
public.pgbench_tellers)
r (a1, a2, a3, a4) ON ((l.a1 = r.a2));

As in the remote query, column aliasing uses attnum-based numbering with
shifted by FirstLowInvalidHeapAttributeNumber to make all attnum positive.
For instance, this system uses alias "a9" for the first user column. For
readability of code around this, I introduced TO_RELATEVE() macro which
converts absolute attnum (-8~) to relative ones (0~). Current deparser can
also handle whole-row references (attnum == 0) correctly.

3) Executor calls BeginForeignScan to initialize a scan. Here TupleDesc is
taken from the slot, not Relation.

Possible enhancement
====================
- Make deparseSelectSql() more general, thus it can handle both simple
SELECT and join SELECT by calling itself recursively. This would avoid
assuming that underlying ForeignPath remains in RelOptInfo. (WIP)
- Move appendConditions() calls into deparse.c, to clarify responsibility
of modules.
- more accurate estimation
- more detailed information for error location (currently "foreign table"
is used as relation name always)

Attachment Content-Type Size
foreign_join_v7.patch application/octet-stream 145.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2015-04-03 11:33:04 Re: EvalPlanQual behaves oddly for FDW queries involving system columns
Previous Message Fujii Masao 2015-04-03 09:35:49 Re: The return value of allocate_recordbuf()