Re: review: FDW API

From: Shigeru HANADA <hanada(at)metrosystems(dot)co(dot)jp>
To: Jan Urbański <wulczer(at)wulczer(dot)org>
Cc: Postgres - Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: review: FDW API
Date: 2011-01-17 13:13:19
Message-ID: 20110117221318.82F2.6989961C@metrosystems.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, 16 Jan 2011 01:55:19 +0100
Jan Urbański <wulczer(at)wulczer(dot)org> wrote:
> what follows is a review of the FDW API patch from
> http://archives.postgresql.org/message-id/20110114212358.82C7.6989961C@metrosystems.co.jp

Thanks for the comments!
For now, I answer to the first half of your comments. I'll answer to
the rest soon.

> All three patches apply cleanly and compile without warnings. Regression
> tests pass.
>
> Let me go patch by patch, starting with the first one that adds the
> HANDLER option.

Sure.

> It adds one useless hunk in src/backend/commands/foreigncmds.c (changes
> order if #includes).
>
> There's a typo in a C commnent ("determin which validator to be used").
>
> Other than that, it looks OK.

Fixed in attached patch.

> The third patch just adds a GetForeignTable helper function and it looks OK.

Thanks. This patch might be able to committed separately because it
would be small enough, and similar to existing lookup functions such
as GetForeignDataWrapper() and GetForeignServer().

> The second patch actually adds the API. First of all, I'd like say that
> it's a really cool piece of code, allowing all kinds of awesome
> funcionality to be added. I'm already excited by the things that this
> will make possible. Congratulations!
>
> To get a feel of the API I wrote a simple FDW wrapper that presents data
> from the commitfest RSS feed, based heavily on twitter_fdw by Hitoshi
> Harada. Please treat my thoughts as someone's who doesn't really know
> *why* the API looks like it does, but has some observations about what
> was missing or what felt strange when using it. I guess that's the
> position a typical FDW wrapper writer will be in.

Sure, I think your point of view is very important.

> First of all, the C comments mention that PlanRelScan should put a tuple
> descriptor in FdwPlan, but there's no such field in it. Also, comments
> for PlanRelScan talk about the 'attnos' argument, which is not in the
> function's signature. I guess the comments are just obsolete and should
> be updated. I think it's actually a good thing you don't have to put a
> TupleDesc in FdwPlan.

Removed comments about 'attnos' and tuple descriptor.

> There are two API methods, PlanNative and PlanQuery that are ifdef'd out
> using IN_THE_FUTURE. Isn't the standard symbol we use NOT_USED? Also,
> the comments say you can implement either PlanRelScan or PlanNative, and
> only the former is available for now. If we keep these methods
> commented, they should be moved to the end of the struct, so that
> uncommenting them will not break compatibility with existing FDWs.

Agreed. Moved ifdef'd part to the end of struct.

> The only documentation a FDW writer has is fdwapi.h, so comments there
> need to be top-notch. We might contemplate writing a documentation
> chapter explaining how FDW handlers should be written, like we do for C
> SRFs and libpq programs, but I guess for now just the headers file would
> be enough.

file_fdw and postgresql_fdw would be good samples for wrapper
developer if we could ship them as contrib modules.

> FdwExecutionState is just a struct around a void pointer, can we imagine
> adding more fields there? If not, maybe we could just remove the
> structure and pass void pointers around? OTOH that gives us some
> compiler checking and possibility of extending the struct, so I guess we
> could also just leave it like that.

ISTM that using a struct as a interface is better than void*, as you
mentioned.

> The Iterate method gets passed a TupleTableSlot. Do we really need such
> a low-level structure? It makes returning the result easy, because you
> just form your tuple and call ExecStoreTuple, but perhaps we could
> abstract that away a bit and add a helper method that will take a tuple
> and call ExecStoreTuple for us, passing InvalidBuffer and false as the
> remaining arguments. Or maybe make Iterate return the tuple and call
> ExecStoreTuple internally? I don't have strong opinions, but
> TupleTableSlot feels a bit too gutty - I'm having a hard time imagining
> what fields from it would be useful for a FDW writer, and so perhaps you
> don't need to expose it.

This would be debatable issue. Currently Iterate() is expected to
return materialized HeapTuple through TupleTableSlot.

I think an advantage to use TupleTableSlot is that FDW can set shoudFree
flag for the tuple.

> Why does BeginScan accept a ParamListInfo argument? First of all, it
> feels like a parsing thing, not a relation scan thing, so perhaps it
> should be available at some other, earlier stage. Second of all, what
> would it be useful for anyway? Neither file_fdw nor my commitfest_fdw
> does anything with it.

ParamListInfo is added to pass parameters of PREPARE/EXECUTE statement
to FDWs.

Plan for a prepared query is generated at PREPARE with placeholders,
and executed at EXECUTE with actual values. With ParamListInfo
parameter for BeginScan(), FDWs would be able to optimize their remote
query with actual parameter values.

Regard,
--
Shigeru Hanada

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2011-01-17 13:15:04 Re: [HACKERS] Fwd: Weird issues when reading UDT from stored function
Previous Message Robert Haas 2011-01-17 13:00:51 Re: Replication logging