Re: [v9.3] writable foreign tables

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov *EXTERN*" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: [v9.3] writable foreign tables
Date: 2012-11-16 20:09:05
Message-ID: CADyhKSVPY8imSVo9frV8wm2v5gh+e_Ddq-zWpNd7Y1dPezH1LA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012/11/16 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>> The attached patch is just a refreshed version for clean applying to
>> the latest tree.
>>
>> As previous version doing, it makes pseudo enhancement on file_fdw
>> to print something about the supplied tuple on INSERT, UPDATE and
>> DELETE statement.
>
> Basics:
> -------
>
> The patch applies cleanly, compiles without warnings and passes
> regression tests.
>
> I think that the functionality is highly desirable; judging from
> the number of talks at pgConf EU about SQL/MED this is a hot
> topic, and further development is welcome.
>
> Testing the functionality:
> --------------------------
>
> I ran a few queries with the file_fdw and found this:
>
> $ cat flatfile
> 1,Laurenz,1968-10-20
> 2,Renée,1975-09-03
> 3,Caroline,2009-01-26
> 4,Ray,2011-03-09
> 5,Stephan,2011-03-09
>
> CREATE SERVER file FOREIGN DATA WRAPPER file_fdw;
>
> CREATE FOREIGN TABLE flat(
> id integer not null,
> name varchar(20) not null,
> birthday date not null
> ) SERVER file
> OPTIONS (filename 'flatfile', format 'csv');
>
> UPDATE flat SET name='' FROM flat f WHERE f.id = flat.id and f.name like '%e';
> ERROR: bitmapset is empty
>
Hmm... I'll try to investigate the behavior.

> About the code:
> ---------------
>
> I am not so happy with GetForeignRelInfo:
> - The name seems ill-chosen from the FDW API side.
> I guess that you chose the name because the function
> is called from get_relation_info, but I think the name
> should be more descriptive for the FDW API.
> Something like PlanForeignRelRowid.
>
Indeed, GetForeignRelInfo might give misleading impression
as if this routine collects widespread information about
target relation. So, how about GetForeignRelWidth() instead?

> - I guess that every FDW that only needs "rowid" will
> do exactly the same as your fileGetForeignRelInfo.
> Why can't that be done in core?
> The function could pass an AttrNumber for the rowid
> to the FDW, and will receive a boolean return code
> depending on whether the FDW plans to use rowid or not.
> That would be more convenient for FDW authors.
>
This design tries to kill two-birds with one-stone.
It enables to add multiple number of pseudo-columns,
not only "rowid", and makes possible to push-down
complex calculation of target list into external computing
resource.

For example, when user gives the following query:

SELECT ((c1 - c2) * (c2 - c3))^2 FROM ftable

it contains a complex calculation in the target-list,
thus, it also takes CPU cycles of local process.

If we can replace the "((c1 - c2) * (c2 - c3))^2" by
a reference to a pseudo-column that also references
the calculation result on external node, it effectively
off-load CPU cycles.

In this case, all we need to do is (1) acquire a slot
for pseudo-column at GetForeignRelInfo (2) replace
TargetEntry::expr by Var node that reference this
pseudo-column.

It makes sense for performance optimization, so I don't
want to restrict this handler for "rowid" only.

> - I guess the order is dictated by planner steps, but
> it would be "nice to have" if GetForeignRelInfo were
> not the first function to be called during planning.
> That would make it easier for a FDW to support both
> 9.2 and 9.3 (fewer #ifdefs), because the FDW plan state
> will probably have to be created in the first API
> function.
>
The baserel->fdw_private should be initialized to NULL,
so it can perform as a mark whether private data is already
constructed, or not.

> I also wonder why you changed the signature of functions in
> trigger.c. It changes an exposed API unnecessarily, unless
> you plan to have trigger support for foreign tables.
> That is currently not supported, and I think that such
> changes should be part of the present patch.
>
You are right. It might be unneeded change right now.
I'll revert it.

> Other than that, I like the patch.
> I cannot give an educated opinion if the changes to planner
> and executor are well done or not.
>
> Other comments:
> ---------------
>
> I hope I find time to implement the API in oracle_fdw to
> be able to test it more thoroughly.
>
> There is an interdependence with the "FDW for PostgreSQL" patch
> in the commitfest. If both these patches get committed, the
> FDW should be extended to support the new API. If nothing else,
> this would greatly improve the ability to test the new API
> and find out if it is well designed.
>
It is the reason why I'd like to volunteer to review Hanada-san's patch.
I'll spent my time for reviewing his patch on this weekend.

>> Here is one other idea. My GPU acceleration module (PG-Strom)
>> implements column-oriented data store underlying foreign table.
>> It might make sense to cut out this portion for proof-of-concept of
>> writable foreign tables.
>>
>> Any ideas?
>
> The best would be to have a patch on top of the PostgreSQL FDW
> to be able to test. It need not be perfect.
>
OK,

In addition, I noticed my patch didn't update documentation stuff.
I also add mention about new handlers.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Antonin Houska 2012-11-16 21:02:05 Re: Materialized views WIP patch
Previous Message Atri Sharma 2012-11-16 19:42:26 Re: WIP patch for hint bit i/o mitigation