Re: [v9.3] writable foreign tables

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: [v9.3] writable foreign tables
Date: 2012-09-24 08:49:16
Message-ID: CADyhKSX1wzwpaMWxC-1t1O9BM8Hhgu6Re2e+=bw2FNcVOZWAtw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012/9/23 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> 2012/8/29 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> 2012/8/28 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>>> 2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>>>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>>>>>> Would it be too invasive to introduce a new pointer in TupleTableSlot
>>>>>> that is NULL for anything but virtual tuples from foreign tables?
>>>>
>>>>> I'm not certain whether the duration of TupleTableSlot is enough to
>>>>> carry a private datum between scan and modify stage.
>>>>
>>>> It's not.
>>>>
>>>>> Is it possible to utilize ctid field to move a private pointer?
>>>>
>>>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry the
>>>> TID from scan to modify --- in fact, most of the time what the modify
>>>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>>>> *got* a physical CTID field.
>>>>
>>>> Instead, the planner arranges for the TID to be carried up as an
>>>> explicit resjunk column named ctid. (Currently this is done in
>>>> rewriteTargetListUD(), but see also preptlist.c which does some related
>>>> things for SELECT FOR UPDATE.)
>>>>
>>>> I'm inclined to think that what we need here is for FDWs to be able to
>>>> modify the details of that behavior, at least to the extent of being
>>>> able to specify a different data type than TID for the row
>>>> identification column.
>>>>
>>> Hmm. It seems to me a straight-forward solution rather than ab-use
>>> of ctid system column. Probably, cstring data type is more suitable
>>> to carry a private datum between scan and modify stage.
>>>
>>> One problem I noticed is how FDW driver returns an extra field that
>>> is in neither system nor regular column.
>>> Number of columns and its data type are defined with TupleDesc of
>>> the target foreign-table, so we also need a feature to extend it on
>>> run-time. For example, FDW driver may have to be able to extend
>>> a "virtual" column with cstring data type, even though the target
>>> foreign table does not have such a column.
>>>
>> I tried to investigate the related routines.
>>
>> TupleDesc of TupleTableSlot associated with ForeignScanState
>> is initialized at ExecInitForeignScan as literal.
>> ExecAssignScanType assigns TupleDesc of the target foreign-
>> table on tts_tupleDescriptor, "as-is".
>> It is the reason why IterateForeignScan cannot return a private
>> datum except for the columns being declared as regular ones.
>>
> The attached patch improved its design according to the upthread
> discussion. It now got away from ab-use of "ctid" field, and adopts
> a concept of pseudo-column to hold row-id with opaque data type
> instead.
>
> Pseudo-column is Var reference towards attribute-number larger
> than number of attributes on the target relation; thus, it is not
> a substantial object. It is normally unavailable to reference such
> a larger attribute number because TupleDesc of each ScanState
> associated with a particular relation is initialized at ExecInitNode.
>
> The patched ExecInitForeignScan was extended to generate its
> own TupleDesc including pseudo-column definitions on the fly,
> instead of relation's one, when scan-plan of foreign-table requires
> to have pseudo-columns.
>
> Right now, the only possible pseudo-column is "rowid" being
> injected at rewriteTargetListUD(). It has no data format
> restriction like "ctid" because of VOID data type.
> FDW extension can set an appropriate value on the "rowid"
> field in addition to contents of regular columns at
> IterateForeignScan method, to track which remote row should
> be updated or deleted.
>
> Another possible usage of this pseudo-column is push-down
> of target-list including complex calculation. It may enable to
> move complex mathematical formula into remote devices
> (such as GPU device?) instead of just a reference of Var node.
>
> This patch adds a new interface: GetForeignRelInfo being invoked
> from get_relation_info() to adjust width of RelOptInfo->attr_needed
> according to the target-list which may contain "rowid" pseudo-column.
> Some FDW extension may use this interface to push-down a part of
> target list into remote side, even though I didn't implement this
> feature on file_fdw.
>
> RelOptInfo->max_attr is a good marker whether the plan shall have
> pseudo-column reference. Then, ExecInitForeignScan determines
> whether it should generate a TupleDesc, or not.
>
> The "rowid" is fetched using ExecGetJunkAttribute as we are currently
> doing on regular tables using "ctid", then it shall be delivered to
> ExecUpdate or ExecDelete. We can never expect the fist argument of
> them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
> argument of BR-trigger routines redefined also.
>
> [kaigai(at)iwashi sepgsql]$ cat ~/testfile.csv
> 10 aaa
> 11 bbb
> 12 ccc
> 13 ddd
> 14 eee
> 15 fff
> [kaigai(at)iwashi sepgsql]$ psql postgres
> psql (9.3devel)
> Type "help" for help.
>
> postgres=# UPDATE ftbl SET b = md5(b) WHERE a > 12 RETURNING *;
> INFO: ftbl is the target relation of UPDATE
> INFO: fdw_file: BeginForeignModify method
> INFO: fdw_file: UPDATE (lineno = 4)
> INFO: fdw_file: UPDATE (lineno = 5)
> INFO: fdw_file: UPDATE (lineno = 6)
> INFO: fdw_file: EndForeignModify method
> a | b
> ----+----------------------------------
> 13 | 77963b7a931377ad4ab5ad6a9cd718aa
> 14 | d2f2297d6e829cd3493aa7de4416a18f
> 15 | 343d9040a671c45832ee5381860e2996
> (3 rows)
>
> UPDATE 3
> postgres=# DELETE FROM ftbl WHERE a % 2 = 1 RETURNING *;
> INFO: ftbl is the target relation of DELETE
> INFO: fdw_file: BeginForeignModify method
> INFO: fdw_file: DELETE (lineno = 2)
> INFO: fdw_file: DELETE (lineno = 4)
> INFO: fdw_file: DELETE (lineno = 6)
> INFO: fdw_file: EndForeignModify method
> a | b
> ----+-----
> 11 | bbb
> 13 | ddd
> 15 | fff
> (3 rows)
>
> DELETE 3
>
> In addition, there is a small improvement. ExecForeignInsert,
> ExecForeignUpdate and ExecForeignDelete get being able
> to return number of processed rows; that allows to push-down
> whole the statement into remote-side, if it is enough simple
> (e.g, delete statement without any condition).
>
> Even though it does not make matter right now, pseudo-columns
> should be adjusted when foreign-table is referenced with table
> inheritance feature, because an attribute number being enough
> large in parent table is not enough large in child table.
> We need to fix up them until foreign table feature got inheritance
> capability.
>
> I didn't update the documentation stuff because I positioned
> the state of this patch as proof-of-concept now. Please note that.
>
A tiny bit of this patch was updated. I noticed INTERNAL data type
is more suitable to move a private datum from scan-stage to
modify-stage, because its type-length is declared as SIZEOF_POINTER.

Also, I fixed up some obvious compiler warnings.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment Content-Type Size
pgsql-v9.3-writable-fdw-poc.v3.patch application/octet-stream 46.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Heikki Linnakangas 2012-09-24 09:00:06 Re: Patch for option in pg_resetxlog for restore from WAL files
Previous Message Heikki Linnakangas 2012-09-24 08:41:01 Re: Configuration include directory