Re: [v9.3] writable foreign tables

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Alexander Korotkov <aekorotkov(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: [v9.3] writable foreign tables
Date: 2012-11-04 08:08:57
Message-ID: CADyhKSX3nkbKXJ3An31Z1wY8q+ZWuakVs+z2LRgsnPJX391yPw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

2012/11/2 Alexander Korotkov <aekorotkov(at)gmail(dot)com>:
> On Mon, Sep 24, 2012 at 12:49 PM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>>
>> 2012/9/23 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> > 2012/8/29 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> >> 2012/8/28 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
>> >>> 2012/8/28 Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>:
>> >>>> Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> writes:
>> >>>>>> Would it be too invasive to introduce a new pointer in
>> >>>>>> TupleTableSlot
>> >>>>>> that is NULL for anything but virtual tuples from foreign tables?
>> >>>>
>> >>>>> I'm not certain whether the duration of TupleTableSlot is enough to
>> >>>>> carry a private datum between scan and modify stage.
>> >>>>
>> >>>> It's not.
>> >>>>
>> >>>>> Is it possible to utilize ctid field to move a private pointer?
>> >>>>
>> >>>> UPDATEs and DELETEs do not rely on the ctid field of tuples to carry
>> >>>> the
>> >>>> TID from scan to modify --- in fact, most of the time what the modify
>> >>>> step is going to get is a "virtual" TupleTableSlot that hasn't even
>> >>>> *got* a physical CTID field.
>> >>>>
>> >>>> Instead, the planner arranges for the TID to be carried up as an
>> >>>> explicit resjunk column named ctid. (Currently this is done in
>> >>>> rewriteTargetListUD(), but see also preptlist.c which does some
>> >>>> related
>> >>>> things for SELECT FOR UPDATE.)
>> >>>>
>> >>>> I'm inclined to think that what we need here is for FDWs to be able
>> >>>> to
>> >>>> modify the details of that behavior, at least to the extent of being
>> >>>> able to specify a different data type than TID for the row
>> >>>> identification column.
>> >>>>
>> >>> Hmm. It seems to me a straight-forward solution rather than ab-use
>> >>> of ctid system column. Probably, cstring data type is more suitable
>> >>> to carry a private datum between scan and modify stage.
>> >>>
>> >>> One problem I noticed is how FDW driver returns an extra field that
>> >>> is in neither system nor regular column.
>> >>> Number of columns and its data type are defined with TupleDesc of
>> >>> the target foreign-table, so we also need a feature to extend it on
>> >>> run-time. For example, FDW driver may have to be able to extend
>> >>> a "virtual" column with cstring data type, even though the target
>> >>> foreign table does not have such a column.
>> >>>
>> >> I tried to investigate the related routines.
>> >>
>> >> TupleDesc of TupleTableSlot associated with ForeignScanState
>> >> is initialized at ExecInitForeignScan as literal.
>> >> ExecAssignScanType assigns TupleDesc of the target foreign-
>> >> table on tts_tupleDescriptor, "as-is".
>> >> It is the reason why IterateForeignScan cannot return a private
>> >> datum except for the columns being declared as regular ones.
>> >>
>> > The attached patch improved its design according to the upthread
>> > discussion. It now got away from ab-use of "ctid" field, and adopts
>> > a concept of pseudo-column to hold row-id with opaque data type
>> > instead.
>> >
>> > Pseudo-column is Var reference towards attribute-number larger
>> > than number of attributes on the target relation; thus, it is not
>> > a substantial object. It is normally unavailable to reference such
>> > a larger attribute number because TupleDesc of each ScanState
>> > associated with a particular relation is initialized at ExecInitNode.
>> >
>> > The patched ExecInitForeignScan was extended to generate its
>> > own TupleDesc including pseudo-column definitions on the fly,
>> > instead of relation's one, when scan-plan of foreign-table requires
>> > to have pseudo-columns.
>> >
>> > Right now, the only possible pseudo-column is "rowid" being
>> > injected at rewriteTargetListUD(). It has no data format
>> > restriction like "ctid" because of VOID data type.
>> > FDW extension can set an appropriate value on the "rowid"
>> > field in addition to contents of regular columns at
>> > IterateForeignScan method, to track which remote row should
>> > be updated or deleted.
>> >
>> > Another possible usage of this pseudo-column is push-down
>> > of target-list including complex calculation. It may enable to
>> > move complex mathematical formula into remote devices
>> > (such as GPU device?) instead of just a reference of Var node.
>> >
>> > This patch adds a new interface: GetForeignRelInfo being invoked
>> > from get_relation_info() to adjust width of RelOptInfo->attr_needed
>> > according to the target-list which may contain "rowid" pseudo-column.
>> > Some FDW extension may use this interface to push-down a part of
>> > target list into remote side, even though I didn't implement this
>> > feature on file_fdw.
>> >
>> > RelOptInfo->max_attr is a good marker whether the plan shall have
>> > pseudo-column reference. Then, ExecInitForeignScan determines
>> > whether it should generate a TupleDesc, or not.
>> >
>> > The "rowid" is fetched using ExecGetJunkAttribute as we are currently
>> > doing on regular tables using "ctid", then it shall be delivered to
>> > ExecUpdate or ExecDelete. We can never expect the fist argument of
>> > them now, so "ItemPointer tupleid" redefined to "Datum rowid", and
>> > argument of BR-trigger routines redefined also.
>> >
>> > [kaigai(at)iwashi sepgsql]$ cat ~/testfile.csv
>> > 10 aaa
>> > 11 bbb
>> > 12 ccc
>> > 13 ddd
>> > 14 eee
>> > 15 fff
>> > [kaigai(at)iwashi sepgsql]$ psql postgres
>> > psql (9.3devel)
>> > Type "help" for help.
>> >
>> > postgres=# UPDATE ftbl SET b = md5(b) WHERE a > 12 RETURNING *;
>> > INFO: ftbl is the target relation of UPDATE
>> > INFO: fdw_file: BeginForeignModify method
>> > INFO: fdw_file: UPDATE (lineno = 4)
>> > INFO: fdw_file: UPDATE (lineno = 5)
>> > INFO: fdw_file: UPDATE (lineno = 6)
>> > INFO: fdw_file: EndForeignModify method
>> > a | b
>> > ----+----------------------------------
>> > 13 | 77963b7a931377ad4ab5ad6a9cd718aa
>> > 14 | d2f2297d6e829cd3493aa7de4416a18f
>> > 15 | 343d9040a671c45832ee5381860e2996
>> > (3 rows)
>> >
>> > UPDATE 3
>> > postgres=# DELETE FROM ftbl WHERE a % 2 = 1 RETURNING *;
>> > INFO: ftbl is the target relation of DELETE
>> > INFO: fdw_file: BeginForeignModify method
>> > INFO: fdw_file: DELETE (lineno = 2)
>> > INFO: fdw_file: DELETE (lineno = 4)
>> > INFO: fdw_file: DELETE (lineno = 6)
>> > INFO: fdw_file: EndForeignModify method
>> > a | b
>> > ----+-----
>> > 11 | bbb
>> > 13 | ddd
>> > 15 | fff
>> > (3 rows)
>> >
>> > DELETE 3
>> >
>> > In addition, there is a small improvement. ExecForeignInsert,
>> > ExecForeignUpdate and ExecForeignDelete get being able
>> > to return number of processed rows; that allows to push-down
>> > whole the statement into remote-side, if it is enough simple
>> > (e.g, delete statement without any condition).
>> >
>> > Even though it does not make matter right now, pseudo-columns
>> > should be adjusted when foreign-table is referenced with table
>> > inheritance feature, because an attribute number being enough
>> > large in parent table is not enough large in child table.
>> > We need to fix up them until foreign table feature got inheritance
>> > capability.
>> >
>> > I didn't update the documentation stuff because I positioned
>> > the state of this patch as proof-of-concept now. Please note that.
>> >
>> A tiny bit of this patch was updated. I noticed INTERNAL data type
>> is more suitable to move a private datum from scan-stage to
>> modify-stage, because its type-length is declared as SIZEOF_POINTER.
>>
>> Also, I fixed up some obvious compiler warnings.
>

Thanks for your comments.

> I've read previous discussion about this patch. It's generally concentrated
> on the question how to identify foreign table row? Your last patch introduce
> "rowid" pseudo-column for foreign table row identification. My notes are
> following:
> 1) AFAICS your patch are designed to support arbitrary number of
> pseudo-columns while only one is currently used. Do you see more use cases
> of pseudo-columns?
>
Good question. Please see the p.39 of my slide at PGconf.EU 2012 (page of
"TargetList push-down")
(*) http://wiki.postgresql.org/wiki/PostgreSQL_Conference_Europe_Talks_2012

In case when targetList contains complex calculation such as mathematical
operation that takes long CPU cycles, this feature will allows to push down
burden of this calculation into foreign computing resource, not only
implementation
to move identifier of modified rows.
If we can move this calculation into remote RDBMS, all the local pgsql needs
to do is just reference a result value in spite of complex calculation.

> 2) You wrote that FDW can support or don't support write depending on having
> corresponding functions. However it's likely some tables of same FDW could
> be writable while another are not. I think we should have some mechanism for
> FDW telling whether particular table is writable.
>
Hmm. It might be a thing to be considered. My preference is, it should
be handled
at callbacks at planner stage. FDW can raise an error according to individual
foreign table configuration, such as existence of primary key and so on.

The resultRelation of Query can tell whether the scan plan is for
modification of
the table, or not.

> 3) I have another point about identification of foreign rows which I didn't
> meet in previous discussion. What if we restrict writable FDW table to have
> set of column which are unique identifier of a row. Many replication systems
> have this restriction: relicated tables should have a unique key. In case of
> text or csv file I don't see why making line number column user visible is
> bad.
>
The file_fdw portion of this patch is just a proof-of-concept, so I don't think
it is commitable enhancement right now. Thus, here is no special reason
why we don't expose the pseudo line number column for users, however,
it it not a fundamental essence of this patch.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2012-11-04 08:39:16 Re: Unresolved error 0xC0000409 on Windows Server
Previous Message Matthew Gerber 2012-11-04 00:47:08 Re: Unresolved error 0xC0000409 on Windows Server