Re: pgsql_fdw, FDW for PostgreSQL server

From: Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
To: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Hitoshi Harada <umi(dot)tanuki(at)gmail(dot)com>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql_fdw, FDW for PostgreSQL server
Date: 2012-02-01 11:15:57
Message-ID: 4F291EED.7000000@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2012/01/30 4:39), Kohei KaiGai wrote:
> I checked the "fdw_helper_funcs_v3.patch", "pgsql_fdw_v5.patch" and
> "pgsql_fdw_pushdown_v1.patch". My comments are below.

Thanks for the review!

> [BUG]
> Even though pgsql_fdw tries to push-down qualifiers being executable
> on the remove side at the deparseSql(), it does not remove qualifiers
> being pushed down from the baserel->baserestrictinfo, thus, these
> qualifiers are eventually executed twice.
>
> See the result of this EXPLAIN.
> postgres=# EXPLAIN SELECT * FROM ft1 WHERE a> 2 AND f_leak(b);
> QUERY PLAN
> ------------------------------------------------------------------------------------------------------
> Foreign Scan on ft1 (cost=107.43..122.55 rows=410 width=36)
> Filter: (f_leak(b) AND (a> 2))
> Remote SQL: DECLARE pgsql_fdw_cursor_0 SCROLL CURSOR FOR SELECT
> a, b FROM public.t1 WHERE (a> 2)
> (3 rows)
>
> My expectation is (a> 2) being executed on the remote-side and f_leak(b)
> being executed on the local-side. But, filter of foreign-scan on ft1 has both
> of qualifiers. It has to be removed, if a RestrictInfo get pushed-down.

It's intentional that pgsql_fdw keeps pushed-down qualifier in
baserestrictinfo, because I saw some undesirable behavior when I
implemented so that with such optimization when plan is reused, but it's
not clear to me now. I'll try to recall what I saw...

BTW, I think evaluating pushed-down qualifiers again on local side is
safe and has no semantic problem, though we must pay little for such
overhead. Do you have concern about performance?

> [Design comment]
> I'm not sure the reason why store_result() uses MessageContext to save
> the Tuplestorestate within PgsqlFdwExecutionState.
> The source code comment says it is used to avoid memory leaks in error
> cases. I also have a similar experience on implementation of my fdw module,
> so, I could understand per-scan context is already cleared at the timing of
> resource-release-callback, thus, handlers to external resource have to be
> saved on separated memory context.
> In my personal opinion, the right design is to declare a memory context for
> pgsql_fdw itself, instead of the abuse of existing memory context.
> (More wise design is to define sub-memory-context for each foreign-scan,
> then, remove the sub-memory-context after release handlers.)

I simply chose built-in context which has enough lifespan, but now I
think that using MessageContext directly is not recommended way. As you
say, creating new context as child of MessageContext for each scan in
BeginForeignScan (or first IterateForeignScan) would be better. Please
see attached patch.

One other option is getting rid of tuplestore by holding result rows as
PGresult, and track it for error cases which might happen.
ResourceOwner callback can be used to release PGresult on error, similar
to PGconn.

> [Design comment]
> When "BEGIN" should be issued on the remote-side?
> The connect_pg_server() is an only chance to issue "BEGIN" command
> at the remote-side on connection being opened. However, the connection
> shall be kept unless an error is not raised. Thus, the remote-side will
> continue to work within a single transaction block, even if local-side iterates
> a pair of "BEGIN" and "COMMIT".
> I'd like to suggest to close the transaction block at the timing of either
> end of the scan, transaction or sub-transaction.

Indeed, remote transactions should be terminated at some timing.
Terminating at the end of a scan seems troublesome because a connection
might be shared by multiple scans in a query. I'd prefer aborting
remote transaction at the end of local query. Please see
abort_remote_tx in attached patch.

> [Comment to Improve]
> Also, which transaction isolation level should be specified in this case?
> An idea is its isolation level is specified according to the current isolation
> level on the local-side.
> (Of course, it is your choice if it is not supported right now.)

Choosing same as local seems better. Please see start_remote_tx
function in attached patch.

> [Comment to improve]
> It seems to me the design of exprFunction is not future-proof, if we add
> a new node type that contains two or more function calls, because it can
> return an oid of functions.
> I think, the right design is to handle individual node-types within the
> large switch statement at foreign_expr_walker().
> Of course, it is just my sense.

You mean that exprFunction should have capability to handle multiple
Oids for one node, maybe return List<oid> or something? IMO it's
overkill at this time.
Though I'm not sure that it's reasonable, but exprInputCollation too
seems to not assume that multiple input collation might be stored in one
node.

> [Comment to improve]
> The pgsql_fdw_handler() allocates FdwRoutine using makeNode(),
> then it set function-pointers on each fields.
> Why don't you return a pointer to statically declared FdwRoutine
> variable being initialized at compile time, like:
>
> static FdwRoutine pgsql_fdw_handler = {
> .type = T_FdwRoutine,
> .PlanForeignScan = pgsqlPlanForeignScan,
> .ExplainForeignScan = pgsqlExplainForeignScan,
> .BeginForeignScan = pgsqlBeginForeignScan,
> .IterateForeignScan = pgsqlIterateForeignScan,
> .ReScanForeignScan = pgsqlReScanForeignScan,
> .EndForeignScan = pgsqlEndForeignScan,
> };
>
> Datum
> pgsql_fdw_handler(PG_FUNCTION_ARGS)
> {
> PG_RETURN_POINTER(&pgsql_fdw_handler);
> }

Fixed to static variable without designated initializers, because it's
one of C99 feature. Is there any written policy about using C99
features in PG development? I've read Developer's FAQ (wiki) and code
formatting section of document, but I couldn't find any mention.

> [Question to implementation]
> At pgsqlIterateForeignScan(), it applies null-check on festate->tuples
> and bool-checks on festete->cursor_opened.
> Do we have a possible scenario that festate->tuples is not null, but
> festate->cursor_opened, or an opposite combination?
> If null-check on festate->tuples is enough to detect the first call of
> the iterate callback, it is not my preference to have redundant flag.

[ checking... ] No such scenario. It's undesired remain of obsolete
support of simple SELECT mode; pgsql_fdw once had two fetching mode;
simple SELECT mode for small result and CURSOR mode for huge result.
I've removed cursor_opened from PgsqlFdwExecutionState structure.

[Extra change]
In addition to your comments, I found that some regression tests fail
because current planner produces different plan tree from expected
results, and fixed such tests.

Regards,
--
Shigeru Hanada

Attachment Content-Type Size
pgsql_fdw_pushdown_v2.patch text/plain 28.2 KB
pgsql_fdw_v6.patch text/plain 104.3 KB
fdw_helper_v4.patch text/plain 12.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ashutosh Bapat 2012-02-01 12:38:06 Re: Confusing EXPLAIN output in case of inherited tables
Previous Message Simon Riggs 2012-02-01 10:25:58 Refactoring log_newpage