Re: ON CONFLICT DO UPDATE for partitioned tables

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andres Freund <andres(at)anarazel(dot)de>
Cc: Pavan Deolasee <pavan(dot)deolasee(at)gmail(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Peter Geoghegan <pg(at)bowt(dot)ie>
Subject: Re: ON CONFLICT DO UPDATE for partitioned tables
Date: 2018-03-19 07:45:11
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks for the updated patches.

On 2018/03/18 13:17, Alvaro Herrera wrote:
> Alvaro Herrera wrote:
>> I think what I should be doing is the same as the returning stuff: keep
>> a tupdesc around, and use a single slot, whose descriptor is changed
>> just before the projection.
> Yes, this works, though it's ugly. Not any uglier than what's already
> there, though, so I think it's okay.
> The only thing that I remain unhappy about this patch is the whole
> adjust_and_expand_partition_tlist() thing. I fear we may be doing
> redundant and/or misplaced work. I'll look into it next week.
> 0001 should be pretty much ready to push -- adjustments to ExecInsert
> and ModifyTableState I already mentioned.

This seems like good cleanup.

While at it, why not also get rid of mt_onconflict in favor of always just
using its counterpart in ModifyTable -- onConflictAction?

> 0002 is stuff I would like to get rid of completely -- changes to
> planner code so that it better supports functionality we need for
> 0003.

Hmm. I'm not sure if we can completely get rid of this, because we do
need the adjust_inherited_tlist() facility to translate TargetEntry resnos
in any case.

But as I just said in reply to Pavan's email suggesting deferring
onConflistSet expansion to execution time, we don't need the hack in
adjust_inherited_tlist() if we go with the suggestion.

> 0003 is the main patch. Compared to the previous version, this one
> reuses slots by switching them to different tupdescs as needed.

Your proposed change to use just one slot (the existing mt_conflproj slot)
sounds good. Instead, it seems now we have an array to hold tupleDescs
for the onConflistSet target lists for each partition.

Some comments:

1. I noticed a bug that crashes a test in insert_conflit.sql that uses DO
NOTHING instead of DO UPDATE SET. It's illegal for ExecInitPartitionInfo
to expect mt_conflproj_tupdesc to be valid in the DO NOTHING case, because
ExecInitModifyTable would only set it if (onConflictAction == DO_UPDATE).

2. It seems better to name the new array field in PartitionTupleRouting
partition_conflproj_tupdescs rather than partition_onconfl_tupdescs to be
consistent with the new field in ModifyTableState.

3. I think it was an oversight in my original patch, but it seems we
should allocate the partition_onconfl_tdescs array only if DO UPDATE
action is used. Also, ri_onConflictSetProj, ri_onConflictSetWhere should
be only set in that case. OTOH, we always need to set
partition_arbiter_indexes, that is, for both DO NOTHING and DO UPDATE SET

4. Need to remove the comments for partition_conflproj_slots and
partition_existing_slots, fields of PartitionTupleRouting that no longer
exist. Instead one for partition_conflproj_tupdescs should be added.

5. I know the following is so as not to break the Assert in
adjust_inherited_tlist(), so why not have a parentOid argument for

+ appinfo.parent_reloid = 1; // dummy parentRel->rd_id;

6. There is a sentence in the comment above adjust_inherited_tlist():

Note that this is not needed for INSERT because INSERT isn't

Maybe, we need to delete that and mention that we do need it in the case
of INSERT ON CONFLICT DO UPDATE on partitioned tables for translating DO
UPDATE SET target list.

7. In ExecInsert, it'd be better to have a partArbiterIndexes, just like
partConflTupdesc in the outermost scope and then do:

+ /* Use the appropriate list of arbiter indexes. */
+ if (mtstate->mt_partition_tuple_routing != NULL)
+ arbiterIndexes = partArbiterIndexes;
+ else
+ arbiterIndexes = node->arbiterIndexes;


+ /* Use the appropriate tuple descriptor. */
+ if (mtstate->mt_partition_tuple_routing != NULL)
+ onconfl_tupdesc = partConflTupdesc;
+ else
+ onconfl_tupdesc = mtstate->mt_conflproj_tupdesc;

using arbiterIndexes and onconfl_tupdesc declared in the appropriate scopes.

I have tried to make these changes and attached are the updated patches
containing those, including the change I suggested for 0001 (that is,
getting rid of mt_onconflict). I also expanded some comments in 0003
while making those changes.


Attachment Content-Type Size
v5-0001-Simplify-ExecInsert-API-re.-ON-CONFLICT-data.patch text/plain 7.7 KB
v5-0002-Make-some-static-functions-work-on-TupleDesc-rath.patch text/plain 7.2 KB
v5-0003-Fix-ON-CONFLICT-to-work-with-partitioned-tables.patch text/plain 39.1 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2018-03-19 08:09:48 Re: [HACKERS] Restricting maximum keep segments by repslots
Previous Message Rushabh Lathia 2018-03-19 07:25:38 Re: INOUT parameters in procedures