Re: ON CONFLICT DO UPDATE for partitioned tables

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>, Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: ON CONFLICT DO UPDATE for partitioned tables
Date: 2018-03-20 11:53:23
Message-ID: 5AB0F633.2030007@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

(2018/03/20 13:30), Amit Langote wrote:
> On 2018/03/19 21:59, Etsuro Fujita wrote:
>> (2018/03/18 13:17), Alvaro Herrera wrote:
>>> Alvaro Herrera wrote:
>>> The only thing that I remain unhappy about this patch is the whole
>>> adjust_and_expand_partition_tlist() thing. I fear we may be doing
>>> redundant and/or misplaced work. I'll look into it next week.
>>
>> I'm still reviewing the patches, but I really agree on that point. As
>> Pavan mentioned upthread, the onConflictSet tlist for the root parent,
>> from which we create a translated onConflictSet tlist for a partition,
>> would have already been processed by expand_targetlist() to contain all
>> missing columns as well, so I think we could create the tlist for the
>> partition by simply re-ordering the expression-converted tlist (ie,
>> conv_setproj) based on the conversion map for the partition. The Attached
>> defines a function for that, which could be called, instead of calling
>> adjust_and_expand_partition_tlist(). This would allow us to get rid of
>> planner changes from the patches. Maybe I'm missing something, though.
>
> Thanks for the patch. I can confirm your proposed
> adjust_onconflictset_tlist() is enough to replace adjust_inherited_tlist()
> + expand_targetlist() combination (aka
> adjust_and_expand_partition_tlist()), thus rendering the planner changes
> in this patch unnecessary. I tested it with a partition tree involving
> partitions of varying attribute numbers (dropped columns included) and it
> seems to work as expected (as also exercised in regression tests) as shown
> below.

Thanks for testing!

> I have incorporated your patch in the main patch after updating the
> comments a bit. Also, now that 6666ee49f49 is in [1], the transition
> table related tests I proposed yesterday pass nicely. Instead of posting
> as a separate patch, I have merged it with the main patch. So now that
> planner refactoring is unnecessary, attached is just one patch.

Here are comments on executor changes in (the latest version of) the patch:

@@ -421,8 +424,18 @@ ExecInsert(ModifyTableState *mtstate,
ItemPointerData conflictTid;
bool specConflict;
List *arbiterIndexes;
+ PartitionTupleRouting *proute =
+ mtstate->mt_partition_tuple_routing;

- arbiterIndexes = node->arbiterIndexes;
+ /* Use the appropriate list of arbiter indexes. */
+ if (mtstate->mt_partition_tuple_routing != NULL)
+ {
+ Assert(partition_index >= 0 && proute != NULL);
+ arbiterIndexes =
+ proute->partition_arbiter_indexes[partition_index];
+ }
+ else
+ arbiterIndexes = node->arbiterIndexes;

To handle both cases the same way, I wonder if it would be better to
have the arbiterindexes list in ResultRelInfo as well, as mentioned by
Alvaro upthread, or to re-add mt_arbiterindexes as before and set it to
proute->partition_arbiter_indexes[partition_index] before we get here,
maybe in ExecPrepareTupleRouting, in the case of tuple routing.

ExecOnConflictUpdate(ModifyTableState *mtstate,
ResultRelInfo *resultRelInfo,
+ TupleDesc onConflictSetTupdesc,
ItemPointer conflictTid,
TupleTableSlot *planSlot,
TupleTableSlot *excludedSlot,
@@ -1419,6 +1459,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate,
ExecCheckHeapTupleVisible(estate, &tuple, buffer);

/* Store target's existing tuple in the state's dedicated slot */
+ ExecSetSlotDescriptor(mtstate->mt_existing, RelationGetDescr(relation));
ExecStoreTuple(&tuple, mtstate->mt_existing, buffer, false);

/*
@@ -1462,6 +1503,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate,
}

/* Project the new tuple version */
+ ExecSetSlotDescriptor(mtstate->mt_conflproj, onConflictSetTupdesc);
ExecProject(resultRelInfo->ri_onConflictSetProj);

Can we do ExecSetSlotDescriptor for mtstate->mt_existing and
mtstate->mt_conflproj in ExecPrepareTupleRouting in the case of tuple
routing? That would make the API changes to ExecOnConflictUpdate
unnecessary.

@@ -2368,9 +2419,13 @@ ExecInitModifyTable(ModifyTable *node, EState
*estate, int eflags)
econtext = mtstate->ps.ps_ExprContext;
relationDesc = resultRelInfo->ri_RelationDesc->rd_att;

- /* initialize slot for the existing tuple */
- mtstate->mt_existing =
- ExecInitExtraTupleSlot(mtstate->ps.state, relationDesc);
+ /*
+ * Initialize slot for the existing tuple. We determine which
+ * tupleDesc to use for this after we have determined which relation
+ * the insert/update will be applied to, possibly after performing
+ * tuple routing.
+ */
+ mtstate->mt_existing = ExecInitExtraTupleSlot(mtstate->ps.state, NULL);

/* carried forward solely for the benefit of explain */
mtstate->mt_excludedtlist = node->exclRelTlist;
@@ -2378,8 +2433,16 @@ ExecInitModifyTable(ModifyTable *node, EState
*estate, int eflags)
/* create target slot for UPDATE SET projection */
tupDesc = ExecTypeFromTL((List *) node->onConflictSet,
relationDesc->tdhasoid);
+ PinTupleDesc(tupDesc);
+ mtstate->mt_conflproj_tupdesc = tupDesc;
+
+ /*
+ * Just like the "existing tuple" slot, we'll defer deciding which
+ * tupleDesc to use for this slot to a point where tuple routing has
+ * been performed.
+ */
mtstate->mt_conflproj =
- ExecInitExtraTupleSlot(mtstate->ps.state, tupDesc);
+ ExecInitExtraTupleSlot(mtstate->ps.state, NULL);

If we do ExecInitExtraTupleSlot for mtstate->mt_existing and
mtstate->mt_conflproj in ExecPrepareTupleRouting in the case of tuple
routing, as said above, we wouldn't need this changes. I think doing
that only in the case of tuple routing and keeping this as-is would be
better because that would save cycles in the normal case.

I'll look at other parts of the patch next.

Best regards,
Etsuro Fujita

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2018-03-20 11:55:07 Re: [HACKERS] Add support for tuple routing to foreign partitions
Previous Message Dagfinn Ilmari =?utf-8?Q?Manns=C3=A5ker?= 2018-03-20 11:30:59 Re: configure's checks for --enable-tap-tests are insufficient