Re: partition routing layering in nodeModifyTable.c

From: Andres Freund <andres(at)anarazel(dot)de>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, amitdkhan(dot)pg(at)gmail(dot)com
Subject: Re: partition routing layering in nodeModifyTable.c
Date: 2019-08-02 18:01:38
Message-ID: 20190802180138.64zcircokw2upaho@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2019-07-31 17:04:38 +0900, Amit Langote wrote:
> I looked into trying to do the things I mentioned above and it seems
> to me that revising BeginDirectModify()'s API to receive the
> ResultRelInfo directly as Andres suggested might be the best way
> forward. I've implemented that in the attached 0001. Patches that
> were previously 0001, 0002, and 0003 are now 0002, 003, and 0004,
> respectively. 0002 is now a patch to "remove"
> es_result_relation_info.

Thanks! Some minor quibbles aside, the non FDW patches look good to me.

Fujita-san, do you have any comments on the FDW API change? Or anybody
else?

I'm a bit woried about the move of BeginDirectModify() into
nodeModifyTable.c - it just seems like an odd control flow to me. Not
allowing any intermittent nodes between ForeignScan and ModifyTable also
seems like an undesirable restriction for the future. I realize that we
already do that for BeginForeignModify() (just btw, that already accepts
resultRelInfo as a parameter, so being symmetrical for BeginDirectModify
makes sense), but it still seems like the wrong direction to me.

The need for that move, I assume, comes from needing knowing the correct
ResultRelInfo, correct? I wonder if we shouldn't instead determine the
at plan time (in setrefs.c), somewhat similar to how we determine
ModifyTable.resultRelIndex. Doesn't look like that'd be too hard?

Then we could just have BeginForeignModify, BeginDirectModify,
BeginForeignScan all be called from ExecInitForeignScan().

Path 04 is such a nice improvement. Besides getting rid of a substantial
amount of code, it also makes the control flow a lot easier to read.

> @@ -4644,9 +4645,7 @@ GetAfterTriggersTableData(Oid relid, CmdType cmdType)
> * If there are no triggers in 'trigdesc' that request relevant transition
> * tables, then return NULL.
> *
> - * The resulting object can be passed to the ExecAR* functions. The caller
> - * should set tcs_map or tcs_original_insert_tuple as appropriate when dealing
> - * with child tables.
> + * The resulting object can be passed to the ExecAR* functions.
> *
> * Note that we copy the flags from a parent table into this struct (rather
> * than subsequently using the relation's TriggerDesc directly) so that we can
> @@ -5750,14 +5749,26 @@ AfterTriggerSaveEvent(EState *estate, ResultRelInfo *relinfo,
> */
> if (row_trigger && transition_capture != NULL)
> {
> - TupleTableSlot *original_insert_tuple = transition_capture->tcs_original_insert_tuple;
> - TupleConversionMap *map = transition_capture->tcs_map;
> + TupleTableSlot *original_insert_tuple;
> + PartitionRoutingInfo *pinfo = relinfo->ri_PartitionInfo;
> + TupleConversionMap *map = pinfo ?
> + pinfo->pi_PartitionToRootMap :
> + relinfo->ri_ChildToRootMap;
> bool delete_old_table = transition_capture->tcs_delete_old_table;
> bool update_old_table = transition_capture->tcs_update_old_table;
> bool update_new_table = transition_capture->tcs_update_new_table;
> bool insert_new_table = transition_capture->tcs_insert_new_table;
>
> /*
> + * Get the originally inserted tuple from the global variable and set
> + * the latter to NULL because any given tuple must be read only once.
> + * Note that the TransitionCaptureState is shared across many calls
> + * to this function.
> + */
> + original_insert_tuple = transition_capture->tcs_original_insert_tuple;
> + transition_capture->tcs_original_insert_tuple = NULL;

Maybe I'm missing something, but original_insert_tuple is not a global
variable?

> @@ -888,7 +889,8 @@ ExecInitRoutingInfo(ModifyTableState *mtstate,
> PartitionTupleRouting *proute,
> PartitionDispatch dispatch,
> ResultRelInfo *partRelInfo,
> - int partidx)
> + int partidx,
> + bool is_update_result_rel)
> {
> MemoryContext oldcxt;
> PartitionRoutingInfo *partrouteinfo;
> @@ -935,10 +937,15 @@ ExecInitRoutingInfo(ModifyTableState *mtstate,
> if (mtstate &&
> (mtstate->mt_transition_capture || mtstate->mt_oc_transition_capture))
> {
> - partrouteinfo->pi_PartitionToRootMap =
> - convert_tuples_by_name(RelationGetDescr(partRelInfo->ri_RelationDesc),
> - RelationGetDescr(partRelInfo->ri_PartitionRoot),
> - gettext_noop("could not convert row type"));
> + /* If partition is an update target, then we already got the map. */
> + if (is_update_result_rel)
> + partrouteinfo->pi_PartitionToRootMap =
> + partRelInfo->ri_ChildToRootMap;
> + else
> + partrouteinfo->pi_PartitionToRootMap =
> + convert_tuples_by_name(RelationGetDescr(partRelInfo->ri_RelationDesc),
> + RelationGetDescr(partRelInfo->ri_PartitionRoot),
> + gettext_noop("could not convert row type"));
> }

Hm, isn't is_update_result_rel just ModifyTable->operation == CMD_UPDATE?

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Peter Geoghegan 2019-08-02 18:12:34 Re: The unused_oids script should have a reminder to use the 8000-8999 OID range
Previous Message Tomas Vondra 2019-08-02 17:52:39 Re: pglz performance