Re: Add support for tuple routing to foreign partitions

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Maksim Milyutin <milyutinma(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Subject: Re: Add support for tuple routing to foreign partitions
Date: 2017-10-26 07:40:37
Message-ID: 8104b875-d5c1-b881-6ec3-5ff956ae10da@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Maksim,

On 2017/10/02 21:37, Maksim Milyutin wrote:
> On 11.09.2017 16:01, Etsuro Fujita wrote:
>> * Query planning: the patch creates copies of Query/Plan with a
>> foreign partition as target from the original Query/Plan for each
>> foreign partition and invokes PlanForeignModify with those copies, to
>> allow the FDW to do query planning for remote INSERT with the existing
>> API.  To make such Queries the similar way inheritance_planner does, I
>> modified transformInsertStmt so that the inh flag for the partitioned
>> table's RTE is set to true, which allows (1) expand_inherited_rtentry
>> to build an RTE and AppendRelInfo for each partition in the
>> partitioned table and (2) make_modifytable to build such Queries using
>> adjust_appendrel_attrs and those AppendRelInfos.
>>
>> * explain.c: I modified show_modifytable_info so that we can show
>> remote queries for foreign partitions in EXPLAIN for INSERT into a
>> partitioned table the same way as for inherited UPDATE/DELETE cases.
>> Here is an example:
>>
>> postgres=# explain verbose insert into pt values (1), (2);
>>                             QUERY PLAN
>> -------------------------------------------------------------------
>>  Insert on public.pt  (cost=0.00..0.03 rows=2 width=4)
>>    Foreign Insert on public.fp1
>>      Remote SQL: INSERT INTO public.t1(a) VALUES ($1)
>>    Foreign Insert on public.fp2
>>      Remote SQL: INSERT INTO public.t2(a) VALUES ($1)
>>    ->  Values Scan on "*VALUES*"  (cost=0.00..0.03 rows=2 width=4)
>>          Output: "*VALUES*".column1
>> (7 rows)
>>
>> I think I should add more explanation about the patch, but I don't
>> have time today, so I'll write additional explanation in the next
>> email. Sorry about that.
>
> Could you update your patch, it isn't applied on the actual state of
> master. Namely conflict in src/backend/executor/execMain.c

Attached is an updated version.

* As mentioned in "Query planning", the patch builds an RTE for each
partition so that the FDW can make reference to that RTE in eg,
PlanForeignModify. set_plan_references also uses such RTEs to record
plan dependencies for plancache.c to invalidate cached plans. See an
example for that added to the regression tests in postgres_fdw.

* As mentioned in "explain.c", the EXPLAIN shows all partitions beneath
the ModifyTable node. One merit of that is we can show remote queries
for foreign partitions in the output as shown above. Another one I can
think of is when reporting execution stats for triggers. Here is an
example for that:

postgres=# explain analyze insert into list_parted values (1, 'hi
there'), (2, 'hi there');
QUERY PLAN
--------------------------------------------------------------------------------------------------------------
Insert on list_parted (cost=0.00..0.03 rows=2 width=36) (actual
time=0.375..0.375 rows=0 loops=1)
Insert on part1
Insert on part2
-> Values Scan on "*VALUES*" (cost=0.00..0.03 rows=2 width=36)
(actual time=0.004..0.007 rows=2 loops=1)
Planning time: 0.089 ms
Trigger part1brtrig on part1: time=0.059 calls=1
Trigger part2brtrig on part2: time=0.021 calls=1
Execution time: 0.422 ms
(8 rows)

This would allow the user to understand easily that "part1" and "part2"
in the trigger lines are the partitions of list_parted. So, I think
it's useful to show partition info in the ModifyTable node even in the
case where a partitioned table only contains plain tables.

* The patch modifies make_modifytable and ExecInitModifyTable so that
the former can allow the FDW to construct private plan data for each
foreign partition and accumulate it all into a list, and that the latter
can perform BeginForeignModify for each partition using that plan data
stored in the list passed from make_modifytable. Other changes I made
to the executor are: (1) currently, we set the RT index for the root
partitioned table to ri_RangeTableIndex of partitions' ResultRelInfos,
but the proposed EXPLAIN requires that the partition's
ri_RangeTableIndex is set to the RT index for that partition's RTE, to
show that partition info in the output. So, I made that change.
Because of that, ExecPartitionCheck, ExecConstraints, and
ExecWithCheckOptions are adjusted accordingly. (2) I added a new member
to ResultRelInfo (ie, ri_PartitionIsValid), and modified
CheckValidResultRel so that it checks a given partition without throwing
an error and save the result in that flag so that ExecInsert determines
using that flag whether a partition chosen by ExecFindPartition is valid
for tuple-routing as proposed before.

* copy.c: I still don't think it's a good idea to implement COPY
tuple-routing for foreign partitions using PlanForeignModify. (I plan
to propose new FDW APIs for bulkload as discussed before, to implement
this feature.) So, I kept that as-is. Two things I changed there are:
(1) currently, ExecSetupPartitionTupleRouting verifies partitions using
CheckValidResultRel, but I don't think we need the CheckValidResultRel
check in the COPY case. So, I refactored that function and checked
partitions directly. (2) I think it'd be better to distinguish the
error message "cannot route inserted tuples to a foreign partition" in
the COPY case from the INSERT case, I changed it to "cannot route copied
tuples to a foreign partition".

* Fixed some bugs, revised comments, added a bit more regression tests,
and rebased the patch.

Comments are welcome!

My apologies for the very late reply.

Best regards,
Etsuro Fujita

Attachment Content-Type Size
tuple-routing-to-foreign-partitions-v3.patch text/plain 60.3 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message 高增琦 2017-10-26 08:03:50 Try to fix endless loop in ecpg with informix mode
Previous Message Pavel Stehule 2017-10-26 07:21:24 proposal: schema variables