Re: Add support for tuple routing to foreign partitions

From: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Add support for tuple routing to foreign partitions
Date: 2017-07-06 10:50:33
Message-ID: 02d3ec78-b58b-7376-0267-d1fe7f2da594@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2017/07/05 9:13, Amit Langote wrote:
> On 2017/06/29 20:20, Etsuro Fujita wrote:

>> In relation to this, I also allowed expand_inherited_rtentry() to
>> build an RTE and AppendRelInfo for each partition of a partitioned table
>> that is an INSERT target, as mentioned by Amit in [1], by modifying
>> transformInsertStmt() so that we set the inh flag for the target table's
>> RTE to true when the target table is a partitioned table.
>
> About this part, Robert sounded a little unsure back then [1]; his words:
>
> "Doing more inheritance expansion early is probably not a good idea
> because it likely sucks for performance, and that's especially unfortunate
> if it's only needed for foreign tables."
>
> But...
>
>> The partition
>> RTEs are not only referenced by FDWs, but used in selecting relation
>> aliases for EXPLAIN (see below) as well as extracting plan dependencies in
>> setref.c so that we force rebuilding of the plan on any change to
>> partitions. (We do replanning on FDW table-option changes to foreign
>> partitions, currently. See regression tests in the attached patch.)
>
> ...this part means we cannot really avoid locking at least the foreign
> partitions during the planning stage, which we currently don't, as we
> don't look at partitions at all before ExecSetupPartitionTupleRouting() is
> called by ExecInitModifyTable().

Another case where we need inheritance expansion during the planning
stage would probably be INSERT .. ON CONFLICT into a partitioned table,
I guess. We don't support that yet, but if implementing that, I guess
we would probably need to look at each partition and do something like
infer_arbiter_indexes() for each partition during the planner stage.
Not sure.

> Since we are locking *all* the partitions anyway, it may be better to
> shift the cost of locking to the planner by doing the inheritance
> expansion even in the insert case (for partitioned tables) and avoid
> calling the lock manager again in the executor when setting up
> tuple-routing data structures (that is, in ExecSetupPartitionTupleRouting).

We need the execution-time lock, anyway. See the comments from Robert
in [3].

> An idea that Robert recently mentioned on the nearby "UPDATE partition
> key" thread [2] seems relevant in this regard. By that idea,
> expand_inherited_rtentry(), instead of reading in the partition OIDs in
> the old catalog order, will instead call
> RelationBuildPartitionDispatchInfo(), which locks the partitions and
> returns their OIDs in the canonical order. If we store the foreign
> partition RT indexes in that order in ModifyTable.partition_rels
> introduced by this patch, then the following code added by the patch could
> be made more efficient (as also explained in aforementioned Robert's email):
>
> + foreach(l, node->partition_rels)
> + {
> + Index rti = lfirst_int(l);
> + Oid relid = getrelid(rti, estate->es_range_table);
> + bool found;
> + int j;
> +
> + /* First, find the ResultRelInfo for the partition */
> + found = false;
> + for (j = 0; j < mtstate->mt_num_partitions; j++)
> + {
> + resultRelInfo = partitions + j;
> +
> + if (RelationGetRelid(resultRelInfo->ri_RelationDesc) ==
> relid)
> + {
> + Assert(mtstate->mt_partition_indexes[i] == 0);
> + mtstate->mt_partition_indexes[i] = j;
> + found = true;
> + break;
> + }
> + }
> + if (!found)
> + elog(ERROR, "failed to find ResultRelInfo for rangetable
> index %u", rti);

Agreed. I really want to improve this based on that idea.

Thank you for the comments!

Best regards,
Etsuro Fujita

[3]
https://www.postgresql.org/message-id/CA+TgmoYiwviCDRi3Zk+QuXj1r7uMu9T_kDNq+17PCWgzrbzw8A@mail.gmail.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2017-07-06 11:29:59 Re: hash index on unlogged tables doesn't behave as expected
Previous Message Kuntal Ghosh 2017-07-06 10:48:50 Re: Error while copying a large file in pg_rewind