Quick Links

Re: [v9.3] writable foreign tables

From:	Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To:	Ronan Dunklau <rdunklau(at)gmail(dot)com>
Cc:	Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>, Atri Sharma <atri(dot)jiit(at)gmail(dot)com>, "Alexander Korotkov EXTERN" <aekorotkov(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject:	Re: [v9.3] writable foreign tables
Date:	2012-12-23 11:30:59
Message-ID:	CADyhKSXnbm1kvn9H7SxVxckuJn+fLPcY9SVZJWjimk4zaHPYkg@mail.gmail.com
Views:	Whole Thread \| Raw Message \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

The attached patches are revised version for before-row triggers or
default setting.

I adjusted APIs to allow FDW drivers to modify the supplied slot
according to the row
being actually inserted / updated on the remote side.
Now, relevant handlers are declared as follows:

TupleTableSlot *
ExecForeignInsert (ResultRelInfo *resultRelInfo,
TupleTableSlot *slot);

TupleTableSlot *
ExecForeignUpdate (ResultRelInfo *resultRelInfo,
const char *rowid,
TupleTableSlot *slot);

bool
ExecForeignDelete (ResultRelInfo *resultRelInfo,
const char *rowid);

If nothing were modified, insert or update handler usually returns the supplied
slot as is. Elsewhere, it can return a modified image of tuple according to the
result of remote query execution, including default setting or
before-row trigger.

Let's see the following examples with enhanced postgres_fdw.

postgres=# CREATE FOREIGN TABLE ft1 (a int, b text, c date) SERVER
loopback OPTIONS(relname 't1');
CREATE FOREIGN TABLE
postgres=# SELECT * FROM ft1;
a | b | c
---+-----+------------
1 | aaa | 2012-12-10
2 | bbb | 2012-12-15
3 | ccc | 2012-12-20
(3 rows)

==> ft1 is a foreign table behalf of t1 on loopback server.

postgres=# ALTER TABLE t1 ALTER c SET DEFAULT current_date;
ALTER TABLE
postgres=# CREATE OR REPLACE FUNCTION t1_trig_func() RETURNS trigger AS $$
postgres$# BEGIN
postgres$# NEW.b = NEW.b || '_trig_update';
postgres$# RETURN NEW;
postgres$# END;
postgres$# $$ LANGUAGE plpgsql;
CREATE FUNCTION
postgres=# CREATE TRIGGER t1_br_trig BEFORE INSERT OR UPDATE ON t1 FOR
EACH ROW EXECUTE PROCEDURE t1_trig_func();
CREATE TRIGGER

==> The "t1" table has default setting and before-row triggers.

postgres=# INSERT INTO ft1 VALUES (4,'ddd','2012-12-05'),
(5,'eee',null) RETURNING *;
a | b | c
---+-----------------+------------
4 | ddd_trig_update | 2012-12-05
5 | eee_trig_update |
(2 rows)

INSERT 0 2
postgres=# INSERT INTO ft1 VALUES (6, 'fff') RETURNING *;
a | b | c
---+-----------------+------------
6 | fff_trig_update | 2012-12-23
(1 row)

INSERT 0 1

==> RETURNING clause can back modified image on the remote side.
Please note that the results are affected by default-setting and
before-row procedure on remote side.

postgres=# UPDATE ft1 SET a = a + 100 RETURNING *;
a | b | c
-----+-----------------------------+------------
101 | aaa_trig_update | 2012-12-10
102 | bbb_trig_update | 2012-12-15
103 | ccc_trig_update | 2012-12-20
104 | ddd_trig_update_trig_update | 2012-12-05
105 | eee_trig_update_trig_update |
106 | fff_trig_update_trig_update | 2012-12-23
(6 rows)

UPDATE 6

==> update command also

The modified API lost a feature to return the number of rows being affected
that might be useful in case when underlying query multiple rows on the
remote side. However, I rethought it does not really make sense.
If FDW driver support a feature to push down whole UPDATE or DELETE
command towards simple queries, it is not so difficult to cache the result
of the query that affect multiple rows, then the relevant routine can pop
the cached result for each invocation without remote query execution on
demand.

Thanks,

2012/12/18 Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>:
> Hi,
>
> 2012/12/18 Ronan Dunklau <rdunklau(at)gmail(dot)com>:
>> Hello.
>>
>> I've tried to implement this API for our Multicorn FDW, and I have a few
>> questions about the API.
>>
>> First, I don't understand what are the requirements on the "rowid"
>> pseudo-column values.
>>
>> In particular, this sentence from a previous mail makes it ambiguous to me:
>>
>>
>>> At the beginning, I constructed the rowid pseudo-column using
>>> INTERNALOID, but it made troubles when fetched tuples are
>>> materialized prior to table modification.
>>> So, the "rowid" argument of them are re-defined as "const char *".
>>
>> Does that mean that one should only store a cstring in the rowid
>> pseudo-column ? Or can one store an arbitrary pointer ? Currently, I'm
>> building a text value using cstring_to_text_with_len. Could there be a
>> problem with that ?
>>
> All what we require on the rowid pseudo-column is it has capability to
> identify a particular row on the remote side. In case of postgres_fdw,
> ctid of the relevant table is sufficient for the purpose.
> I don't recommend to set the rowid field an arbitrary pointer, because
> scanned tuple may be materialized between scanning and modifying
> foreign table, thus, the arbitrary pointer shall be dealt under the
> assumption of cstring data type.
>
>> Secondly, how does one use a returning clause ?
>> I've tried to look at the postgres_fdw code, but it does not seem to handle
>> that.
>>
>> For what its worth, knowing that the postgres_fdw is still in WIP status,
>> here is a couple result of my experimentation with it:
>>
>> - Insert into a foreign table mapped to a table with a "before" trigger,
>> using a returning clause, will return the values as they were before being
>> modified by the trigger.
>> - Same thing, but if the trigger prevents insertion (returning NULL), the
>> "would-have-been" inserted row is still returned, although the insert
>> statement reports zero rows.
>>
> Hmm. Do you want to see the "real" final contents of the rows being inserted,
> don't you. Yep, the proposed interface does not have capability to modify
> the supplied tuple on ExecForeignInsert or other methods.
>
> Probably, it needs to adjust interface to allow FDW drivers to return
> a modified HeapTuple or TupleTableSlot for RETURNING calculation.
> (As trigger doing, it can return the given one as-is if no change)
>
> Let me investigate the code around this topic.
>
>> - Inserting into a table with default values does not work as intended,
>> since missing values are replaced by a null value in the remote statement.
>>
> It might be a bug of my proof-of-concept portion at postgres_fdw.
> The prepared INSERT statement should list up columns being actually
> used only, not all ones unconditionally.
>
> Thanks,
>
>> What can be done to make the behaviour more consistent ?
>>
>> I'm very excited about this feature, thank you for making this possible.
>>
>> Regards,
>> --
>> Ronan Dunklau
>>
>> 2012/12/14 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
>>>
>>> Kohei KaiGai wrote:
>>> >> I came up with one more query that causes a problem:
>>> [...]
>>> >> This causes a deadlock, but one that is not detected;
>>> >> the query just keeps hanging.
>>> >>
>>> >> The UPDATE in the CTE has the rows locked, so the
>>> >> SELECT ... FOR UPDATE issued via the FDW connection will hang
>>> >> indefinitely.
>>> >>
>>> >> I wonder if that's just a pathological corner case that we shouldn't
>>> >> worry about. Loopback connections for FDWs themselves might not
>>> >> be so rare, for example as a substitute for autonomous subtransactions.
>>> >>
>>> >> I guess it is not easily possible to detect such a situation or
>>> >> to do something reasonable about it.
>>> >
>>> > It is not avoidable problem due to the nature of distributed database
>>> > system,
>>> > not only loopback scenario.
>>> >
>>> > In my personal opinion, I'd like to wait for someone implements
>>> > distributed
>>> > lock/transaction manager on top of the background worker framework being
>>> > recently committed, to intermediate lock request.
>>> > However, it will take massive amount of efforts to existing
>>> > lock/transaction
>>> > management layer, not only enhancement of FDW APIs. It is obviously out
>>> > of scope in this patch.
>>> >
>>> > So, I'd like to suggest authors of FDW that support writable features to
>>> > put
>>> > mention about possible deadlock scenario in their documentation.
>>> > At least, above writable CTE example is a situation that two different
>>> > sessions
>>> > concurrently update the "test" relation, thus, one shall be blocked.
>>>
>>> Fair enough.
>>>
>>> >> I tried to overhaul the documentation, see the attached patch.
>>> >>
>>> >> There was one thing that I was not certain of:
>>> >> You say that for writable foreign tables, BeginForeignModify
>>> >> and EndForeignModify *must* be implemented.
>>> >> I thought that these were optional, and if you can do your work
>>> > with just, say, ExecForeignDelete, you could do that.
>>> >
>>> > Yes, that's right. What I wrote was incorrect.
>>> > If FDW driver does not require any state during modification of
>>> > foreign tables, indeed, these are not mandatory handler.
>>>
>>> I have updated the documentation, that was all I changed in the
>>> attached patches.
>>>
>>> > OK. I split the patch into two portion, part-1 is the APIs relevant
>>> > patch, part-2 is relevant to postgres_fdw patch.
>>>
>>> Great.
>>>
>>> I'll mark the patch as "ready for committer".
>>>
>>> Yours,
>>> Laurenz Albe
>>>
>>>
>>> --
>>> Sent via pgsql-hackers mailing list (pgsql-hackers(at)postgresql(dot)org)
>>> To make changes to your subscription:
>>> http://www.postgresql.org/mailpref/pgsql-hackers
>>>
>>
>
>
>
> --
> KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>

Attachment	Content-Type	Size
pgsql-v9.3-writable-fdw-poc.v10.part-2.patch	application/octet-stream	129.0 KB
pgsql-v9.3-writable-fdw-poc.v10.part-1.patch	application/octet-stream	51.3 KB

In response to

Re: [v9.3] writable foreign tables at 2012-12-18 21:39:55 from Kohei KaiGai

Responses

Re: [v9.3] writable foreign tables at 2013-01-29 09:19:42 from Kohei KaiGai

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Fujii Masao	2012-12-23 13:33:15	Re: pg_basebackup from cascading standby after timeline switch
Previous Message	Marko Kreen	2012-12-23 09:58:40	Re: pgcrypto seeding problem when ssl=on