From: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Add support for tuple routing to foreign partitions |
Date: | 2017-07-06 10:50:33 |
Message-ID: | 02d3ec78-b58b-7376-0267-d1fe7f2da594@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2017/07/05 9:13, Amit Langote wrote:
> On 2017/06/29 20:20, Etsuro Fujita wrote:
>> In relation to this, I also allowed expand_inherited_rtentry() to
>> build an RTE and AppendRelInfo for each partition of a partitioned table
>> that is an INSERT target, as mentioned by Amit in [1], by modifying
>> transformInsertStmt() so that we set the inh flag for the target table's
>> RTE to true when the target table is a partitioned table.
>
> About this part, Robert sounded a little unsure back then [1]; his words:
>
> "Doing more inheritance expansion early is probably not a good idea
> because it likely sucks for performance, and that's especially unfortunate
> if it's only needed for foreign tables."
>
> But...
>
>> The partition
>> RTEs are not only referenced by FDWs, but used in selecting relation
>> aliases for EXPLAIN (see below) as well as extracting plan dependencies in
>> setref.c so that we force rebuilding of the plan on any change to
>> partitions. (We do replanning on FDW table-option changes to foreign
>> partitions, currently. See regression tests in the attached patch.)
>
> ...this part means we cannot really avoid locking at least the foreign
> partitions during the planning stage, which we currently don't, as we
> don't look at partitions at all before ExecSetupPartitionTupleRouting() is
> called by ExecInitModifyTable().
Another case where we need inheritance expansion during the planning
stage would probably be INSERT .. ON CONFLICT into a partitioned table,
I guess. We don't support that yet, but if implementing that, I guess
we would probably need to look at each partition and do something like
infer_arbiter_indexes() for each partition during the planner stage.
Not sure.
> Since we are locking *all* the partitions anyway, it may be better to
> shift the cost of locking to the planner by doing the inheritance
> expansion even in the insert case (for partitioned tables) and avoid
> calling the lock manager again in the executor when setting up
> tuple-routing data structures (that is, in ExecSetupPartitionTupleRouting).
We need the execution-time lock, anyway. See the comments from Robert
in [3].
> An idea that Robert recently mentioned on the nearby "UPDATE partition
> key" thread [2] seems relevant in this regard. By that idea,
> expand_inherited_rtentry(), instead of reading in the partition OIDs in
> the old catalog order, will instead call
> RelationBuildPartitionDispatchInfo(), which locks the partitions and
> returns their OIDs in the canonical order. If we store the foreign
> partition RT indexes in that order in ModifyTable.partition_rels
> introduced by this patch, then the following code added by the patch could
> be made more efficient (as also explained in aforementioned Robert's email):
>
> + foreach(l, node->partition_rels)
> + {
> + Index rti = lfirst_int(l);
> + Oid relid = getrelid(rti, estate->es_range_table);
> + bool found;
> + int j;
> +
> + /* First, find the ResultRelInfo for the partition */
> + found = false;
> + for (j = 0; j < mtstate->mt_num_partitions; j++)
> + {
> + resultRelInfo = partitions + j;
> +
> + if (RelationGetRelid(resultRelInfo->ri_RelationDesc) ==
> relid)
> + {
> + Assert(mtstate->mt_partition_indexes[i] == 0);
> + mtstate->mt_partition_indexes[i] = j;
> + found = true;
> + break;
> + }
> + }
> + if (!found)
> + elog(ERROR, "failed to find ResultRelInfo for rangetable
> index %u", rti);
Agreed. I really want to improve this based on that idea.
Thank you for the comments!
Best regards,
Etsuro Fujita
From | Date | Subject | |
---|---|---|---|
Next Message | Amit Kapila | 2017-07-06 11:29:59 | Re: hash index on unlogged tables doesn't behave as expected |
Previous Message | Kuntal Ghosh | 2017-07-06 10:48:50 | Re: Error while copying a large file in pg_rewind |