Re: non-bulk inserts and tuple routing

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: non-bulk inserts and tuple routing
Date: 2018-02-09 12:20:03
Message-ID: 5A7D91F3.9050309@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2018/02/09 14:32), Amit Langote wrote:
> I had mistakenly tagged these patches v24, but they were actually supposed
> to be v5. So the attached updated patch is tagged v6.

OK.

> On 2018/02/07 19:36, Etsuro Fujita wrote:
>>> (2018/02/05 14:34), Amit Langote wrote:
>>>> The code in tupconv_map_for_subplan() currently assumes that it can rely
>>>> on all leaf partitions having been initialized.
>>
>> On reflection I noticed this analysis is not 100% correct; I think what
>> that function actually assumes is that all sublplans' partitions have
>> already been initialized, not all leaf partitions.
>
> Yes, you're right.
>
>>>> Since we're breaking that
>>>> assumption with this proposal, that needed to be changed. So the patch
>>>> contained some refactoring to make it not rely on that assumption.
>>
>> I don't think we really need this refactoring because since that as in
>> another patch you posted, we could initialize all subplans' partitions in
>> ExecSetupPartitionTupleRouting, I think tupconv_map_for_subplan could be
>> called without any changes to that function because of what I said above.
>
> What my previous approach failed to consider is that in the update case,
> we'd already have ResultRelInfo's for some of the leaf partitions
> initialized, which could be saved into proute->partitions array right away.
>
> Updated patch does things that way, so all the changes I had proposed to
> tupconv_map_for_subplan are rendered unnecessary.

OK, thanks for the updated patch!

>> Here are comments for the other patch (patch
>> v24-0002-During-tuple-routing-initialize-per-partition-ob.patch):
>>
>> o On changes to ExecSetupPartitionTupleRouting:
>>
>> * The comment below wouldn't be correct; no ExecInitResultRelInfo in the
>> patch.
>>
>> - proute->partitions = (ResultRelInfo **) palloc(proute->num_partitions *
>> - sizeof(ResultRelInfo *));
>> + /*
>> + * Actual ResultRelInfo's and TupleConversionMap's are allocated in
>> + * ExecInitResultRelInfo().
>> + */
>> + proute->partitions = (ResultRelInfo **) palloc0(proute->num_partitions *
>> + sizeof(ResultRelInfo
> *));
>
> I removed the comment altogether, as the comments elsewhere make the point
> clear.
>
>> * The patch removes this from the initialization step for a subplan's
>> partition, but I think it would be better to keep this here because I
>> think it's a good thing to put the initialization stuff together into one
>> place.
>>
>> - /*
>> - * This is required in order to we convert the partition's
>> - * tuple to be compatible with the root partitioned table's
>> - * tuple descriptor. When generating the per-subplan result
>> - * rels, this was not set.
>> - */
>> - leaf_part_rri->ri_PartitionRoot = rel;
>
> It wasn't needed here with the previous approach, because we didn't touch
> any ResultRelInfo's in ExecSetupPartitionTupleRouting, but I've added it
> back in the updated patch.
>
>> * I think it would be better to keep this comment here.
>>
>> - /* Remember the subplan offset for this ResultRelInfo */
>
> Fixed.
>
>> * Why is this removed from that initialization?
>>
>> - proute->partitions[i] = leaf_part_rri;
>
> Because of the old approach. Now it's back in.
>
>> o On changes to ExecInitPartitionInfo:
>>
>> * I don't understand the step starting from this, but I'm wondering if
>> that step can be removed by keeping the above setup of proute->partitions
>> for the subplan's partition in ExecSetupPartitionTupleRouting.
>>
>> + /*
>> + * If we are doing tuple routing for update, try to reuse the
>> + * per-subplan resultrel for this partition that ExecInitModifyTable()
>> + * might already have created.
>> + */
>> + if (mtstate&& mtstate->operation == CMD_UPDATE)
>
> Done, as mentioned above.
>
> On 2018/02/08 19:16, Etsuro Fujita wrote:
>> Here are some minor comments:
>>
>> o On changes to ExecInsert
>>
>> * This might be just my taste, but I think it would be better to (1)
>> change ExecInitPartitionInfo so that it creates and returns a
>> newly-initialized ResultRelInfo for an initialized partition, and (2)
>> rewrite this bit:
>>
>> + /* Initialize partition info, if not done already. */
>> + ExecInitPartitionInfo(mtstate, resultRelInfo, proute, estate,
>> + leaf_part_index);
>> +
>> /*
>> * Save the old ResultRelInfo and switch to the one corresponding to
>> * the selected partition.
>> */
>> saved_resultRelInfo = resultRelInfo;
>> resultRelInfo = proute->partitions[leaf_part_index];
>> + Assert(resultRelInfo != NULL);
>>
>> to something like this (I would say the same thing to the copy.c changes):
>>
>> /*
>> * Save the old ResultRelInfo and switch to the one corresponding to
>> * the selected partition.
>> */
>> saved_resultRelInfo = resultRelInfo;
>> resultRelInfo = proute->partitions[leaf_part_index];
>> if (resultRelInfo == NULL);
>> {
>> /* Initialize partition info. */
>> resultRelInfo = ExecInitPartitionInfo(mtstate,
>> saved_resultRelInfo,
>> proute,
>> estate,
>> leaf_part_index);
>> }
>>
>> This would make ExecInitPartitionInfo more simple because it can assume
>> that the given partition has not been initialized yet.
>
> Agree that it's much better to do it this way. Done.

Thanks for all those changes!

>> o On changes to execPartition.h
>>
>> * Please add a brief decsription about partition_oids to the comments for
>> this struct.
>>
>> @@ -91,6 +91,7 @@ typedef struct PartitionTupleRouting
>> {
>> PartitionDispatch *partition_dispatch_info;
>> int num_dispatch;
>> + Oid *partition_oids;
>
> Done.

Thanks, but one thing I'm wondering is: do we really need this array? I
think we could store into PartitionTupleRouting the list of oids
returned by RelationGetPartitionDispatchInfo in
ExecSetupPartitionTupleRouting, instead. Sorry, I should have commented
this in a previous email, but what do you think about that?

Here are other comments:

o On changes to ExecSetupPartitionTupleRouting:

* This is nitpicking, but it would be better to define partrel and
part_tupdesc within the if (update_rri_index < num_update_rri &&
RelationGetRelid(update_rri[update_rri_index].ri_RelationDesc) ==
leaf_oid) block.

- ResultRelInfo *leaf_part_rri;
+ ResultRelInfo *leaf_part_rri = NULL;
Relation partrel = NULL;
TupleDesc part_tupdesc;
Oid leaf_oid = lfirst_oid(cell);

* Do we need this? For a leaf partition that is already present in the
subplan resultrels, the partition's indices (if any) would have already
been opened.

+ /*
+ * Open partition indices. We wouldn't
need speculative
+ * insertions though.
+ */
+ if
(leaf_part_rri->ri_RelationDesc->rd_rel->relhasindex &&
+
leaf_part_rri->ri_IndexRelationDescs == NULL)
+ ExecOpenIndices(leaf_part_rri,
false);

I'll look at the patch a bit more early next week, but other than that,
the patch looks fairly in good shape to me.

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Claudio Freire 2018-02-09 12:20:32 Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem
Previous Message Ildar Musin 2018-02-09 11:49:29 Re: Proposal: partition pruning by secondary attributes