Re: bug in update tuple routing with foreign partitions

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: bug in update tuple routing with foreign partitions
Date: 2019-04-18 13:10:06
Message-ID: 5CB8772E.9020006@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Amit-san,

Thanks for the comments!

(2019/04/18 14:08), Amit Langote wrote:
> On 2019/04/18 14:06, Amit Langote wrote:
>> On 2019/04/17 21:49, Etsuro Fujita wrote:

>>> I think the reason for that is: the row routed to remp is incorrectly
>>> fetched as a to-be-updated row when updating remp as a subplan targetrel.
>>
>> Yeah. In the fully-local case, that is, where both the source and the
>> target partition of a row movement operation are local tables, heap AM
>> ensures that tuples that's moved into a given relation in the same command
>> (by way of row movement) are not returned as to-be-updated, because it
>> deems such tuples invisible. The "same command" part being crucial for
>> that to work.
>>
>> In the case where the target of a row movement operation is a foreign
>> table partition, the INSERT used as part of row movement and subsequent
>> UPDATE of the same foreign table are distinct commands for the remote
>> server. So, the rows inserted by the 1st command (as part of the row
>> movement) are deemed visible by the 2nd command (UPDATE) even if both are
>> operating within the same transaction.

Yeah, I think so too.

>> I guess there's no easy way for postgres_fdw to make the remote server
>> consider them as a single command. IOW, no way to make the remote server
>> not touch those "moved-in" rows during the UPDATE part of the local query.

I agree.

>>> The right way to fix this would be to have some way in postgres_fdw in
>>> which we don't fetch such rows when updating such subplan targetrels. I
>>> tried to figure out a (simple) way to do that, but I couldn't.
>>
>> Yeah, that seems a bit hard to ensure with our current infrastructure.

Yeah, I think we should leave that for future work.

>>> So what I'm thinking is to throw an error for cases like this. (Though, I
>>> think we should keep to allow rows to be moved from local partitions to a
>>> foreign-table subplan targetrel that has been updated already.)
>>
>> Hmm, how would you distinguish (totally inside postgred_fdw I presume) the
>> two cases?

One thing I came up with to do that is this:

@@ -1917,6 +1937,23 @@ postgresBeginForeignInsert(ModifyTableState *mtstate,

initStringInfo(&sql);

+ /*
+ * If the foreign table is an UPDATE subplan resultrel that
hasn't yet
+ * been updated, routing tuples to the table might yield incorrect
+ * results, because if routing tuples, routed tuples will be
mistakenly
+ * read from the table and updated twice when updating the table
--- it
+ * would be nice if we could handle this case; but for now,
throw an error
+ * for safety.
+ */
+ if (plan && plan->operation == CMD_UPDATE &&
+ (resultRelInfo->ri_usesFdwDirectModify ||
+ resultRelInfo->ri_FdwState) &&
+ resultRelInfo > mtstate->resultRelInfo +
mtstate->mt_whichplan)
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("cannot route tuples into
foreign table to be updated \"%s\"",
+
RelationGetRelationName(rel))));

> Forgot to say that since this is a separate issue from the original bug
> report, maybe we can first finish fixing the latter. What to do you think?

Yeah, I think we can do that, but my favorite would be to fix the two
issues in a single shot, because they seem to me rather close to each
other. So I updated a new version in a single patch, which I'm attaching.

Notes:

* I kept all the changes in the previous patch, because otherwise
postgres_fdw would fail to release resources for a foreign-insert
operation created by postgresBeginForeignInsert() for a tuple-routable
foreign table (ie, a foreign-table subplan resultrel that has been
updated already) during postgresEndForeignInsert().

* I revised a comment according to your previous comment, though I
changed a state name: s/sub_fmstate/aux_fmstate/

* DirectModify also has the latter issue. Here is an example:

postgres=# create table p (a int, b text) partition by list (a);
postgres=# create table p1 partition of p for values in (1);
postgres=# create table p2base (a int check (a = 2 or a = 3), b text);
postgres=# create foreign table p2 partition of p for values in (2, 3)
server loopback options (table_name 'p2base');

postgres=# insert into p values (1, 'foo');
INSERT 0 1
postgres=# explain verbose update p set a = a + 1;
QUERY PLAN
-----------------------------------------------------------------------------
Update on public.p (cost=0.00..176.21 rows=2511 width=42)
Update on public.p1
Foreign Update on public.p2
-> Seq Scan on public.p1 (cost=0.00..25.88 rows=1270 width=42)
Output: (p1.a + 1), p1.b, p1.ctid
-> Foreign Update on public.p2 (cost=100.00..150.33 rows=1241
width=42)
Remote SQL: UPDATE public.p2base SET a = (a + 1)
(7 rows)

postgres=# update p set a = a + 1;
UPDATE 2
postgres=# select * from p;
a | b
---+-----
3 | foo
(1 row)

As shown in the above bit added to postgresBeginForeignInsert(), I
modified the patch so that we throw an error for cases like this even
when using a direct modification plan, and I also added regression test
cases for that.

Best regards,
Etsuro Fujita

Attachment Content-Type Size
HEAD-bug-update-tuple-routing-with-foreign-parts-efujita-2.patch text/x-patch 12.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2019-04-18 13:51:07 Re: proposal: psql PSQL_TABULAR_PAGER variable
Previous Message Kyotaro HORIGUCHI 2019-04-18 12:02:57 Remove page-read callback from XLogReaderState.