Re: pgsql_fdw in contrib

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql_fdw in contrib
Date: 2012-06-26 13:50:12
Message-ID: CADyhKSXs3d-2mvM=5c_GpehQd_nULL-GbGggLVeVKNn5R6odjA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Harada-san,

I checked your patch, and had an impression that includes many
improvements from the previous revision that I looked at the last
commit fest.

However, I noticed several points to be revised, or investigated.

* It seems to me expected results of the regression test is not
attached, even though test cases were included. Please add it.

* cleanup_connection() does not close the connection in case
when this callback was invoked due to an error under sub-
transaction. It probably makes problematic behavior.

See the following steps to reproduce the matter:

postgres=# BEGIN;
BEGIN
postgres=# SELECT * FROM ft3;
a | b
---+-----
1 | aaa
2 | bbb
3 | ccc
4 | ddd
5 | eee
6 | fff
(6 rows)

postgres=# SAVEPOINT sv_01;
SAVEPOINT

postgres=# SELECT * FROM ft3 WHERE 1 / (a - 4) > 0; -- should be
error on remote transaction
ERROR: could not execute remote query
DETAIL: ERROR: division by zero

HINT: SELECT a, b FROM public.t1 WHERE (((1 OPERATOR(pg_catalog./) (a
OPERATOR(pg_catalog.-) 4)) OPERATOR(pg_catalog.>) 0))

postgres=# ROLLBACK TO SAVEPOINT sv_01;
ROLLBACK

postgres=# SELECT * FROM ft3;
ERROR: could not execute EXPLAIN for cost estimation
DETAIL: ERROR: current transaction is aborted, commands ignored
until end of transaction block

HINT: SELECT a, b FROM public.t1

Once we got an error under the remote transaction, it eventually raises
an error on local side, then cleanup_connection() should be invoked.
But it ignores the error due to sub-transaction, thus, the remote transaction
being already aborted is kept.
I'd like to suggest two remedy.
1. connections are closed, even if errors on sub-transaction.
2. close the connection if PQexecParams() returns an error,
on execute_query() prior to raise a local error.

* Regarding to deparseSimpleSql(), it pulls attributes being referenced
from baserestrictinfo and reltargetlist using pull_var_clause().
Is it unavailable to use local_conds instead of baserestrictinfo?
We can optimize references to the variable being consumed at the
remote side only. All we need to transfer is variables referenced
in targetlist and local-clauses.
In addition, is pull_var_clause() reasonable to list up all the attribute
referenced at the both expression tree? It seems to be pull_varattnos()
is more useful API in this situation.

* Regarding to deparseRelation(), it scans simple_rte_array to fetch
RangeTblEntry with relation-id of the target foreign table.
It does not match correct entry if same foreign table is appeared in
this list twice or more, like a case of self-join. I'd like to recommend
to use simple_rte_array[baserel->relid] instead.
In addition, it checks whether relkind is RELKIND_FOREIGN_TABLE,
or not. It seems to me this check should be Assert(), if routines of
pgsql_fdw is called towards regular relations.

* Regarding to deparseVar(), is it unnecessary to check relkind of
the relation being referenced by Var node, isn't it?

* Regarding to deparseBoolExpr(), compiler raised a warning
because op can be used without initialized.

* Even though it is harmless, sortConditions() is a misleading function
name. How about categolizeConditions() or screeningConditions()?

Thanks for your great jobs.

2012/6/14 Shigeru HANADA <shigeru(dot)hanada(at)gmail(dot)com>:
> I'd like to propose pgsql_fdw, FDW for PostgreSQL, as a contrib module
> in core, again.  This patch is basically rebased version of the patches
> proposed in 9.2 development cycle, and contains some additional changes
> such as concern about volatile variables for PG_CATCH blocks.  In
> addition, I combined old three patches (pgsql_fdw core functionality,
> push-down support, and analyze support) into one patch for ease of
> review. (I think these features are necessary for pgsql_fdw use.)
>
> After the development for 9.2 cycle, pgsql_fdw can gather statistics
> from remote side by providing sampling function to analyze module in
> backend core, use them to estimate selectivity of WHERE clauses which
> will be pushed-down, and retrieve query results from remote side with
> custom row processor which prevents memory exhaustion from huge result set.
>
> It would be possible to add some more features, such as ORDER BY
> push-down with index information support, without changing existing
> APIs, but at first add relatively simple pgsql_fdw and enhance it seems
> better.  In addition, once pgsql_fdw has been merged, it would help
> implementing proof-of-concept of SQL/MED-related features.
>
> Regards,
> --
> Shigeru HANADA
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2012-06-26 14:01:26 Re: [PATCH 01/16] Overhaul walsender wakeup handling
Previous Message Robert Haas 2012-06-26 13:48:37 Re: Catalog/Metadata consistency during changeset extraction from wal