Re: pgsql_fdw, FDW for PostgreSQL server

From: Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Kevin Grittner <Kevin(dot)Grittner(at)wicourts(dot)gov>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>
Subject: Re: pgsql_fdw, FDW for PostgreSQL server
Date: 2012-03-26 09:46:20
Message-ID: 4F703AEC.5000100@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2012/03/15 23:06), Shigeru HANADA wrote:
> Although the patches are still WIP, especially in WHERE push-down part,
> but I'd like to post them so that I can get feedback of the design as
> soon as possible.

I've implemented pgsql_fdw's own deparser and enhanced some features
since last post. Please apply attached patches in the order below:

* pgsql_fdw_v17.patch
- Adds pgsql_fdw as contrib module
* pgsql_fdw_pushdown_v10.patch
- Adds WHERE push down capability to pgsql_fdw
* pgsql_fdw_analyze_v1.patch
- Adds pgsql_fdw_analyze function for updating local stats

Changes from previous version
=============================

1) Don't use remote EXPLAIN for cost/rows estimation, so now planner
estimates result rows and costs on the basis of local statistics such as
pg_class and pg_statistic. To update local statistics, I added
pgsql_fdw_analyze() SQL function which updates local statistics of a
foreign table by retrieving remote statistics, such as pg_class and
pg_statistic, via libpq. This would make the planning of pgsql_fdw
simple and fast. This function can be easily modified to handle ANALYZE
command invoked for a foreign table (Fujita-san is proposing this as
common feature in another thread).

2) Defer planning stuffs as long as possible to clarify the role of each
function. Currently GetRelSize just estimates result rows from local
statistics, and GetPaths adds only one path which represents SeqScan on
remote side. As result of this change, PgsqlFdwPlanState struct is
obsolete.

3) Implement pgsql_fdw's own deparser which pushes down collation-free
and immutable expressions in local WHERE clause. This means that most
of numeric conditions can be pushed down, but conditions using character
types are not.

Most of nodes are deparsed in straightforward way, but OpExpr is not.
OpExpr is deparsed with OPERATOR() notation to specify operator's
schema explicitly. This would prevent us from possible search_path problem.

[local query]
WHERE -col = -1
[remote query]
WHERE ((OPERATOR(pg_class.-) col) OPERATOR(pg_class.=) 1)

4) Pushed down quals are not evaluated on local side again. When
creating ForeignScan node for chosen best path, pushed down expressions
are removed from "qpqual" parameter of make_foreignscan, so a qualifier
is evaluated only once at local or remote.

5) EXPLAIN on pgsql_fdw foreign tables show simple SELECT statement.
DECLARE statement including cursor name is still available in VERBOSE
mode. (I feel that showing DECLARE always is little noisy...)

Regards,
--
Shigeru HANADA

Attachment Content-Type Size
pgsql_fdw_v17.patch text/plain 108.5 KB
pgsql_fdw_pushdown_v10.patch text/plain 51.3 KB
pgsql_fdw_analyze_v1.patch text/plain 30.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-03-26 12:16:51 Re: Command Triggers, v16
Previous Message Kyotaro HORIGUCHI 2012-03-26 08:54:46 Re: Speed dblink using alternate libpq tuple storage