Re: Problem while updating a foreign table pointing to a partitioned table on foreign server

From: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>
To: ashutosh(dot)bapat(at)enterprisedb(dot)com
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Problem while updating a foreign table pointing to a partitioned table on foreign server
Date: 2018-04-18 04:13:17
Message-ID: 20180418.131317.164571488.horiguchi.kyotaro@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hello.

At Mon, 16 Apr 2018 17:05:28 +0530, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote in <CAFjFpRfcgwsHRmpvoOK-GUQi-n8MgAS+OxcQo=aBDn1COywmcg(at)mail(dot)gmail(dot)com>
> Hi,
> Consider this scenario
>
> postgres=# CREATE TABLE plt (a int, b int) PARTITION BY LIST(a);
> postgres=# CREATE TABLE plt_p1 PARTITION OF plt FOR VALUES IN (1);
> postgres=# CREATE TABLE plt_p2 PARTITION OF plt FOR VALUES IN (2);
> postgres=# INSERT INTO plt VALUES (1, 1), (2, 2);
> postgres=# CREATE FOREIGN TABLE fplt (a int, b int) SERVER loopback
> OPTIONS (table_name 'plt');
> postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
> tableoid | ctid | a | b
> ----------+-------+---+---
> fplt | (0,1) | 1 | 1
> fplt | (0,1) | 2 | 2
> (2 rows)
>
> -- Need to use random() so that following update doesn't turn into a
> direct UPDATE.
> postgres=# EXPLAIN (VERBOSE, COSTS OFF)
> postgres-# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE
> 20 END) WHERE a = 1;
> QUERY PLAN
> --------------------------------------------------------------------------------------------
> Update on public.fplt
> Remote SQL: UPDATE public.plt SET b = $2 WHERE ctid = $1
> -> Foreign Scan on public.fplt
> Output: a, CASE WHEN (random() <= '1'::double precision) THEN
> 10 ELSE 20 END, ctid
> Remote SQL: SELECT a, ctid FROM public.plt WHERE ((a = 1)) FOR UPDATE
> (5 rows)
>
> postgres=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE
> 20 END) WHERE a = 1;
> postgres=# SELECT tableoid::regclass, ctid, * FROM fplt;
> tableoid | ctid | a | b
> ----------+-------+---+----
> fplt | (0,2) | 1 | 10
> fplt | (0,2) | 2 | 10
> (2 rows)
>
> We expect only 1 row with a = 1 to be updated, but both the rows get
> updated. This happens because both the rows has ctid = (0, 1) and
> that's the only qualification used for UPDATE and DELETE. Thus when a
> non-direct UPDATE is run on a foreign table which points to a
> partitioned table or inheritance hierarchy on the foreign server, it
> will update rows from all child table which have ctids same as the
> qualifying rows. Same is the case with DELETE.

Anyway I think we should warn or error out if one nondirect
update touches two nor more tuples in the first place.

=# UPDATE fplt SET b = (CASE WHEN random() <= 1 THEN 10 ELSE 20 END) WHERE a = 1;
ERROR: updated 2 rows for a tuple identity on the remote end

> There are two ways to fix this
> 1. Use WHERE CURRENT OF with cursors to update rows. This means that
> we fetch only one row at a time and update it. This can slow down the
> execution drastically.
> 2. Along with ctid use tableoid as a qualifier i.e. WHERE clause of
> UPDATE/DELETE statement has ctid = $1 AND tableoid = $2 as conditions.
>
> PFA patch along the lines of 2nd approach and along with the
> testcases. The idea is to inject tableoid attribute to be fetched from
> the foreign server similar to ctid and then add it to the DML
> statement being constructed.
>
> It does fix the problem. But the patch as is interferes with the way
> we handle tableoid currently. That can be seen from the regression
> diffs that the patch causes. RIght now, every tableoid reference gets
> converted into the tableoid of the foreign table (and not the tableoid
> of the foreign table). Somehow we need to differentiate between the
> tableoid injected for DML and tableoid references added by the user in
> the original query and then use tableoid on the foreign server for the
> first and local foreign table's oid for the second. Right now, I don't
> see a simple way to do that.

We cannot add no non-system (junk) columns not defined in foreign
table columns. We could pass tableoid via a side channel but we
get wrong value if the scan is not consists of only one foreign
relation. I don't think adding remote_tableoid in HeapTupleData
is acceptable. Explicity defining remote_tableoid column in
foreign relation might work but it makes things combersome..

regards,

--
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment Content-Type Size
fdw_error_out_multiple_update.patch text/x-patch 657 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-04-18 04:13:38 Re: ON CONFLICT DO UPDATE for partitioned tables
Previous Message Amit Langote 2018-04-18 02:52:41 Re: pruning disabled for array, enum, record, range type partition keys