Re: [v9.3] writable foreign tables

From: Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp>
To: Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, PgHacker <pgsql-hackers(at)postgresql(dot)org>, Shigeru Hanada <shigeru(dot)hanada(at)gmail(dot)com>
Subject: Re: [v9.3] writable foreign tables
Date: 2012-08-28 08:37:47
Message-ID: CADyhKSVvKz+YKVJ91uBBOZzT1QfAc-QrdtdrjwSfxuXZ0JMDCw@mail.gmail.com
Lists: pgsql-hackers

2012/8/27 Albe Laurenz <laurenz(dot)albe(at)wien(dot)gv(dot)at>:
> Kohei KaiGai wrote:
>> 2012/8/25 Robert Haas <robertmhaas(at)gmail(dot)com>:
>>> On Thu, Aug 23, 2012 at 1:10 AM, Kohei KaiGai <kaigai(at)kaigai(dot)gr(dot)jp> wrote:
>>>> It is a responsibility of FDW extension (and DBA) to ensure each
>>>> foreign-row has a unique identifier that has 48-bits width integer
>>>> data type in maximum.
>
>>> It strikes me as incredibly short-sighted to decide that the row
>>> identifier has to have the same format as what our existing heap AM
>>> happens to have. I think we need to allow the row identifier to be of
>>> any data type, and even compound. For example, the foreign side might
>>> have no equivalent of CTID, and thus use primary key. And the primary
>>> key might consist of an integer and a string, or some such.
>
>> I assume it is a task of FDW extension to translate between the pseudo
>> ctid and the primary key in remote side.
>>
>> For example, if primary key of the remote table is Text data type, an idea
>> is to use a hash table to track the text-formed primary being associated
>> with a particular 48-bits integer. The pseudo ctid shall be utilized to track
>> the tuple to be modified on the scan-stage, then FDW can reference the
>> hash table to pull-out the primary key to be provided on the prepared
>> statement.
>
> And what if there is a hash collision? Then you would not be able to
> determine which row is meant.
>
Even if there is a hash collision, each hash entry can hold the original
key itself, so the keys can be compared to tell the rows apart. But anyway,
I would rather go with the idea of supporting an opaque pointer to track
a particular remote row.
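
To make the collision handling concrete, here is a minimal sketch in
Python (for brevity; it is not FDW code). It assumes the pseudo ctid is a
48-bit hash of the text key and that a collision is resolved by probing
to the next free value. PseudoCtidMap, hash48 and the probing scheme are
names and choices made up for this illustration.

```python
import hashlib

MASK48 = (1 << 48) - 1  # a pseudo ctid must fit in 48 bits


def hash48(key: str) -> int:
    """Hash a text primary key down to a 48-bit integer."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=6).digest()
    return int.from_bytes(digest, "big")


class PseudoCtidMap:
    """Hypothetical per-query map between pseudo ctids and remote keys."""

    def __init__(self, hash_func=hash48):
        self._hash = hash_func
        self._key_by_ctid = {}  # pseudo ctid -> original remote key

    def assign(self, key: str) -> int:
        """Scan stage: return the pseudo ctid for a remote key."""
        ctid = self._hash(key) & MASK48
        # Each entry stores the original key, so a collision is detected
        # by comparing keys; on a mismatch, probe the next 48-bit value.
        while ctid in self._key_by_ctid and self._key_by_ctid[ctid] != key:
            ctid = (ctid + 1) & MASK48
        self._key_by_ctid[ctid] = key
        return ctid

    def lookup(self, ctid: int) -> str:
        """Modify stage: recover the remote key for the prepared statement."""
        return self._key_by_ctid[ctid]
```

With a forced collision (a hash function that always returns 7), "alice"
is assigned 7 and "bob" is assigned 8, and lookup() recovers both keys.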

> I agree with Robert that this should be flexible enough to cater for
> all kinds of row identifiers. Oracle, for example, uses ten byte
> identifiers which would give me a headache with your suggested design.
>
>> Do we have some other reasonable ideas?
>
> Would it be too invasive to introduce a new pointer in TupleTableSlot
> that is NULL for anything but virtual tuples from foreign tables?
>
I'm not certain whether the lifetime of a TupleTableSlot is long enough
to carry a private datum from the scan stage to the modify stage.
For example, in the plan below the slot would be cleared in ExecNestLoop
before the tuple is delivered to ExecModifyTable.

postgres=# EXPLAIN UPDATE t1 SET b = 'abcd' WHERE a IN (SELECT x FROM t2 WHERE x % 2 = 0);
                                  QUERY PLAN
-------------------------------------------------------------------------------
 Update on t1  (cost=0.00..54.13 rows=6 width=16)
   ->  Nested Loop  (cost=0.00..54.13 rows=6 width=16)
         ->  Seq Scan on t2  (cost=0.00..28.45 rows=6 width=10)
               Filter: ((x % 2) = 0)
         ->  Index Scan using t1_pkey on t1  (cost=0.00..4.27 rows=1 width=10)
               Index Cond: (a = t2.x)
(6 rows)
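
The concern can be shown with a toy model, written in Python and not
executor code. It assumes, as a simplification, that a join node copies
column values into its own result slot and clears that slot for every
row. Slot, scan, nest_loop and modify are hypothetical stand-ins.

```python
class Slot:
    """Toy stand-in for a TupleTableSlot with a proposed private pointer."""

    def __init__(self):
        self.values = None   # column values of the current tuple
        self.private = None  # proposed FDW-private row identifier

    def clear(self):
        self.values = None
        self.private = None


def scan(rows, slot):
    """Foreign scan: fills its slot and attaches a private identifier."""
    for row_id, values in rows:
        slot.clear()
        slot.values = values
        slot.private = row_id
        yield slot


def nest_loop(outer, result_slot):
    """Join node: copies column values only, into its own result slot."""
    for input_slot in outer:
        result_slot.clear()
        result_slot.values = tuple(input_slot.values)
        yield result_slot


def modify(source):
    """Modify stage: collects what it can see of each delivered tuple."""
    return [(slot.values, slot.private) for slot in source]
```

Fed directly from scan(), modify() sees the row identifiers. Fed through
nest_loop(), it sees None for every row, because the private field never
leaves the scan's own slot.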

Would it be possible to use the ctid field to carry a private pointer?
The TID data type is internally represented as a pointer to ItemPointerData,
so the datum is wide enough to point to an opaque remote-row identifier
instead, whether that is a string, an int64 or something else.
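
For a sense of the sizes involved, here is a small Python sketch of the
arithmetic. ItemPointerData is 6 bytes (a 32-bit block number plus a
16-bit offset number), which is where the 48-bit limit comes from. The
helper names are made up for this illustration, and the byte layout is
simplified rather than the real in-memory one.

```python
import struct

TID_BYTES = 6  # 4-byte block number + 2-byte offset number


def pack_tid(row_id: int) -> bytes:
    """Split a 48-bit integer into (block, offset) and pack it in 6 bytes."""
    if not 0 <= row_id < 1 << 48:
        raise ValueError("row identifier does not fit in 48 bits")
    block, offset = row_id >> 16, row_id & 0xFFFF
    return struct.pack(">IH", block, offset)


def unpack_tid(raw: bytes) -> int:
    """Inverse of pack_tid."""
    block, offset = struct.unpack(">IH", raw)
    return (block << 16) | offset


def fits_in_tid(identifier: bytes) -> bool:
    """True if an opaque identifier fits in the 6 bytes of a TID."""
    return len(identifier) <= TID_BYTES
```

A ten-byte identifier such as the Oracle one mentioned above does not fit
in the TID value itself, but it poses no problem if the datum is treated
as a pointer to an opaque structure.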

One disadvantage is that the "ctid" system column shows a nonsense value
when a user explicitly references it. But that does not seem to me
a fundamental problem, because we have not given any special meaning
to the "ctid" field of a foreign table.

Thanks,
--
KaiGai Kohei <kaigai(at)kaigai(dot)gr(dot)jp>
