Re: Oddity in tuple routing for foreign partitions

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Oddity in tuple routing for foreign partitions
Date: 2018-04-18 12:55:28
Message-ID: 5AD74040.9010403@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2018/04/18 14:44), Amit Langote wrote:
> On 2018/04/17 16:41, Etsuro Fujita wrote:
>> In the INSERT/COPY-tuple-routing case, as explained by Amit, the
>> RTE at that position in the EState's range table is the one for the
>> partitioned table of a given partition, so the statement would be true.
>> BUT in the UPDATE-tuple-routing case, the RTE is the one for the given
>> partition, not for the partitioned table, so the statement would not be
>> true. In the latter case, we don't need to create a child RTE and replace
>> the original RTE with it, but I handled both cases the same way for
>> simplicity.
>
> Oh, I didn't really consider this part carefully. That the resultRelInfo
> received by BeginForeignInsert (when called by ExecInitRoutingInfo) could
> be a reused UPDATE result relation. It might be possible to justify the
> parent_rte/child_rte terminology by explaining it a bit better.

Yeah, I think that terminology would be confusing, so I changed it to
old_rte/new_rte.

> I see
> three cases that arise during tuple routing:
>
> 1. This is INSERT and so the resultRelation that's received in
> BeginForeignInsert has been freshly created in ExecInitPartitionInfo
> and it bears node->nominalRelation or 1 as its ri_RangeTableIndex

Right.

> 2. This is UPDATE and the resultRelInfo that's received in
> BeginForeignInsert has been freshly created in ExecInitPartitionInfo
> and it bears node->nominalRelation or 1 as its ri_RangeTableIndex

In that case, we have a valid plan node, so it would only bear
node->nominalRelation.

> 3. This is UPDATE and the resultRelInfo that's received in
> BeginForeignInsert is a reused one, in which case, it bears the planner
> assigned ri_RangeTableIndex

Right.

> In all three cases, I think we can rely on using ri_RangeTableIndex to
> fetch a valid "parent" RTE from es_range_table.

I slept on this, I noticed there is another bug in case #2. Here is an
example with the previous patch:

postgres=# create table utrtest (a int, b text) partition by list (a);
postgres=# create table loct (a int check (a in (1)), b text);
postgres=# create foreign table remp (a int check (a in (1)), b text)
server loopback options (table_name 'loct');
postgres=# create table locp (a int check (a in (2)), b text);
postgres=# alter table utrtest attach partition remp for values in (1);
postgres=# alter table utrtest attach partition locp for values in (2);
postgres=# create trigger loct_br_insert_trigger before insert on loct
for each row execute procedure br_insert_trigfunc();

where the trigger function is the one defined in an earlier email.

postgres=# insert into utrtest values (2, 'qux');
INSERT 0 1
postgres=# update utrtest set a = 1 where a = 2 returning *;
a | b
---+-----
1 | qux
(1 row)

UPDATE 1

Wrong result! The result should be concatenated with ' triggered !'.

In case #2, since we initialize expressions for the partition's
ResultRelInfo including RETURNING by translating the attnos of the
corresponding expressions in the ResultRelInfo for the first subplan
target rel, I think we should replace the RTE for the first subplan
target rel, not the RTE for the nominal rel, with the new one created
for the foreign table. Attached is an updated version for fixing this
issue.

> Do you think we need to clarify this in a comment?

Yeah, but I updated comments about this a little bit different way in
the attached. Does that make sense?

Thanks for the comments!

Best regards,
Etsuro Fujita

Attachment Content-Type Size
fix-tuple-routing-for-foreign-partitions-2.patch text/x-diff 15.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2018-04-18 13:09:50 Re: Built-in connection pooling
Previous Message Craig Ringer 2018-04-18 12:45:53 Re: PostgreSQL's handling of fsync() errors is unsafe and risks data loss at least on XFS