Re: Oddity in tuple routing for foreign partitions

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Oddity in tuple routing for foreign partitions
Date: 2018-04-17 07:41:31
Message-ID: 5AD5A52B.7050206@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2018/04/17 16:10), Amit Langote wrote:
> On 2018/04/17 11:13, Kyotaro HORIGUCHI wrote:
>> If I'm reading this correctly, ExecInitParititionInfo calls
>> ExecInitRoutingInfo so currently CheckValidityResultRel is called
>> for the child when partrel is created in ExecPrepareTupleRouting.
>> But the move of CheckValidResultRel results in letting partrel
>> just created in ExecPrepareTupleRouting not be checked at all
>> since ri_ParititionReadyForRouting is always set true in
>> ExecInitPartitionInfo.
>
> I thought the same upon reading the email, but it seems that the patch
> does add CheckValidResultRel() in ExecInitPartitionInfo() as well:
>
> @@ -335,6 +335,13 @@ ExecInitPartitionInfo(ModifyTableState *mtstate,
> estate->es_instrument);
>
> /*
> + * Verify result relation is a valid target for an INSERT. An UPDATE
> of a
> + * partition-key becomes a DELETE+INSERT operation, so this check is
> still
> + * required when the operation is CMD_UPDATE.
> + */
> + CheckValidResultRel(leaf_part_rri, CMD_INSERT);
> +
> + /*
> * Since we've just initialized this ResultRelInfo, it's not in any list
> * attached to the estate as yet. Add it, so that it can be found later.
> *

That's right. So, the validity check would be applied to a partition
created in ExecPrepareTupleRouting as well.

>>>> Another
>>>> thing I noticed is the RT index of the foreign partition set to 1 in
>>>> postgresBeginForeignInsert to create a remote query; that would not work
>>>> for cases where the partitioned table has an RT index larger than 1 (eg,
>>>> the partitioned table is not the primary ModifyTable node); in which
>>>> cases, since the varno of Vars belonging to the foreign partition in the
>>>> RETURNING expressions is the same as the partitioned table, calling
>>>> deparseReturningList with the RT index set to 1 against the RETURNING
>>>> expressions would produce attrs_used to be NULL, leading to postgres_fdw
>>>> not retrieving actually inserted data from the remote, even if remote
>>>> triggers might change those data. So, I fixed this as well, by setting
>>>> the RT index accordingly to the partitioned table.
>>>
>>> Check.
>>
>> I'm not sure but is it true that the parent is always at
>> resultRelation - 1?
>
> Unless I'm missing something, EState.es_range_table contains *all* range
> table entries that were created by the planner. So, if the planner
> assigned resultRelation as the RT index for the parent, then it seems we
> can rely on getting the same entry back with
> list_nth_cell(estate->es_range_table, resultRelation - 1). That seems
> true even if the parent wasn't the *primary* target relation. For
> example, the following works with the patch:
>
> -- in addition to the tables shown in Fujita-san's email, define one more
> -- local partition
> create table locp2 partition of itrtest for values in (2);
>
> -- simple case
> insert into itrtest values (1, 'foo') returning *;
> a | b
> ---+-----------------
> 1 | foo triggered !
> (1 row)
>
> -- itrtest (the parent) appears as both the primary target relation and
> -- otherwise. The latter in the form of being mentioned in the with query
>
> with ins (a, b) as (insert into itrtest values (1, 'foo') returning *)
> insert into itrtest select a + 1, b from ins returning *;
> a | b
> ---+-----------------
> 2 | foo triggered !
> (1 row)

I agree with that. Thanks for the explanation and the example, Amit!

Maybe my explanation (or the naming parent_rte/child_rte in the patch)
was not necessarily good, so I guess that that confused horiguchi-san.
Let me explain. In the INSERT/COPY-tuple-routing case, as explained by
Amit, the RTE at that position in the EState's range table is the one
for the partitioned table of a given partition, so the statement would
be true. BUT in the UPDATE-tuple-routing case, the RTE is the one for
the given partition, not for the partitioned table, so the statement
would not be true. In the latter case, we don't need to create a child
RTE and replace the original RTE with it, but I handled both cases the
same way for simplicity.

Thanks for the comments, Horiguchi-san!

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Konstantin Knizhnik 2018-04-17 07:44:07 Re: Built-in connection pooling
Previous Message Amit Langote 2018-04-17 07:10:14 Re: Oddity in tuple routing for foreign partitions