Re: UPDATE of partition key

From: Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Dilip Kumar <dilipbalaut(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: UPDATE of partition key
Date: 2017-06-19 05:33:01
Message-ID: CAJ3gD9f_cfasOF+W-NB-ksHwSfh+hkOsH7ykKhzqB8z2ZRRkeQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

When I tested partition-key-update on a partitioned table having no
child partitions, it crashed. This is because there is an
Assert(mtstate->mt_num_partitions > 0) for creating the
partition-to-root map, which fails if there are no partitions under
the partitioned table. Actually we should skp creating this map if
there are no partitions under the partitioned table on which UPDATE is
run. So the attached patch has this new change to fix it (and
appropriate additional test case added) :

--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -2006,15 +2006,14 @@ ExecInitModifyTable(ModifyTable *node, EState
*estate, int eflags)
* descriptor of a source partition does not match the root partition
* descriptor. In such case we need to convert tuples to the
root partition
* tuple descriptor, because the search for destination partition starts
- * from the root.
+ * from the root. Skip this setup if it's not a partition key
update or if
+ * there are no partitions below this partitioned table.
*/
- if (is_partitionkey_update)
+ if (is_partitionkey_update && mtstate->mt_num_partitions > 0)
{
TupleConversionMap **tup_conv_maps;
TupleDesc outdesc;

- Assert(mtstate->mt_num_partitions > 0);
-
mtstate->mt_resultrel_maps =
(TupleConversionMap **)
palloc0(sizeof(TupleConversionMap*) * nplans);

On 15 June 2017 at 23:06, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
> On 13 June 2017 at 15:40, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com> wrote:
>> While rebasing my patch for the below recent commit, I realized that a
>> similar issue exists for the uptate-tuple-routing patch as well :
>>
>> commit 78a030a441966d91bc7e932ef84da39c3ea7d970
>> Author: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
>> Date: Mon Jun 12 23:29:44 2017 -0400
>>
>> Fix confusion about number of subplans in partitioned INSERT setup.
>>
>> The above issue was about incorrectly using 'i' in
>> mtstate->mt_plans[i] during handling WITH CHECK OPTIONS in
>> ExecInitModifyTable(), where 'i' was actually meant to refer to the
>> positions in mtstate->mt_num_partitions. Actually for INSERT, there is
>> only a single plan element in mtstate->mt_plans[] array.
>>
>> Similarly, for update-tuple routing, we cannot use
>> mtstate->mt_plans[i], because 'i' refers to position in
>> mtstate->mt_partitions[] , whereas mtstate->mt_plans is not at all in
>> order of mtstate->mt_partitions; in fact mt_plans has only the plans
>> that are to be scanned on pruned partitions; so it can well be a small
>> subset of total partitions.
>>
>> I am working on an updated patch to fix the above.
>
> Attached patch v10 fixes the above. In the existing code, where it
> builds WCO constraints for each leaf partition; with the patch, that
> code now is applicable to row-movement-updates as well. So the
> assertions in the code are now updated to allow the same. Secondly,
> the mapping for each of the leaf partitions was constructed using the
> root partition attributes. Now in the patch, the
> mtstate->resultRelInfo[0] (i.e. the first resultRelInfo) is used as
> reference. So effectively, map_partition_varattnos() now represents
> not just parent-to-partition mapping, but rather, mapping between any
> two partitions/partitioned_tables. It's done this way, so that we can
> have a common WCO building code for inserts as well as updates. For
> e.g. for inserts, the first (and only) WCO belongs to
> node->nominalRelation so nominalRelation is used for
> map_partition_varattnos(), whereas for updates, first WCO belongs to
> the first resultRelInfo which is not same as nominalRelation. So in
> the patch, in both cases, we use the first resultRelInfo and the WCO
> of the first resultRelInfo for map_partition_varattnos().
>
> Similar thing is done for Returning expressions.
>
> ---------
>
> Another change in the patch is : for ExecInitQual() for WCO quals,
> mtstate->ps is used as parent, rather than first plan. For updates,
> first plan does not belong to the parent partition. In fact, I think
> in all cases, we should use mtstate->ps as the parent.
> mtstate->mt_plans[0] don't look like they should be considered parent
> of these expressions. May be it does not matter to which parent we
> link these quals, because there is no ReScan for ExecModifyTable().
>
> Note that for RETURNING projection expressions, we do use mtstate->ps.
>
> --------
>
> There is another issue I discovered. The row-movement works fine if
> the destination leaf partition has different attribute ordering than
> the root : the existing insert-tuple-routing mapping handles that. But
> if the source partition has different ordering w.r.t. the root, it has
> a problem : there is no mapping in the opposite direction, i.e. from
> the leaf to root. And we require that because the tuple of source leaf
> partition needs to be converted to root partition tuple descriptor,
> since ExecFindPartition() starts with root.
>
> To fix this, I have introduced another mapping array
> mtstate->mt_resultrel_maps[]. This corresponds to the
> mtstate->resultRelInfo[]. We don't require per-leaf-partition mapping,
> because the update result relations are pruned subset of the total
> leaf partitions.
>
> So in ExecInsert, before calling ExecFindPartition(), we need to
> convert the leaf partition tuple to root using this reverse mapping.
> Since we need to convert the tuple here, and again after
> ExecFindPartition() for the found leaf partition, I have replaced the
> common code by new function ConvertPartitionTupleSlot().
>
> -------
>
> Used a new flag is_partitionkey_update in ExecInitModifyTable(), which
> can be re-used in subsequent sections , rather than again calling
> IsPartitionKeyUpdate() function again.
>
> -------
>
> Some more test scenarios added that cover above changes. Basically
> partitions that have different tuple descriptors than parents.

--
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company

Attachment Content-Type Size
update-partition-key_v11.patch application/octet-stream 50.1 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-06-19 06:07:39 Setting pd_lower in GIN metapage
Previous Message Ashutosh Sharma 2017-06-19 04:42:15 Re: Getting server crash on Windows when using ICU collation