From: | Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> |
---|---|
To: | Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com> |
Cc: | Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Robert Haas <robertmhaas(at)gmail(dot)com>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: Partition-wise join for join between (declaratively) partitioned tables |
Date: | 2017-09-05 07:46:15 |
Message-ID: | 34e99e97-f099-12f1-73f7-657366d74606@lab.ntt.co.jp |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2017/09/05 15:43, Ashutosh Bapat wrote:
> Ok. Can you please answer my previous questions?
>
> AFAIU, the list contained RTIs of the relations, which didnt' have
> corresponding AppendRelInfos to lock those relations. Now that we
> create AppendRelInfos even for partitioned partitions with my 0001
> patch, I don't think
> we need the list to take care of the locks. Is there any other reason
> why we maintain that list (apart from the trigger case I have raised
> and Fujita-san says that the list is not required in that case as
> well.)?
AppendRelInfos exist within the planner (they come to be and go away
within the planner). Once we leave the planner, that information is gone.
Executor will receive a plan node that won't contain that information:
1. Append has an appendplans field, which contains one plan tree for every
leaf partition. None of its fields, other than partitined_rels,
contains the RT indexes of the partitioned tables. Similarly in the
case of MergeAppend.
2. ModifyTable has a resultRelations fields which contains a list of leaf
partition RT indexes and a plans field which contains one plan tree for
every RT index in the resultRelations list (that is a plan tree that
will scan the particular leaf partition). None of its fields, other
than partitined_rels, contains the RT indexes of the partitioned
tables.
I learned over the course of developing the patch that added this
partitioned_rels field [1] that the executor needs to identify all the
affected tables by a given plan tree so that it could lock them. Executor
needs to lock them separately even if the plan itself was built after
locking all the relevant tables [2]. For example, see
ExecLockNonLeafAppendTables(), which will lock the tables in the
(Merge)Append.partitioned_rels list.
While I've been thinking all along that the same thing must be happening
for RT indexes in ModifyTable.partitioned_rels list (I said so a couple of
times on this thread), it's actually not. Instead,
ModifyTable.partitioned_rels of all ModifyTable nodes in a give query are
merged into PlannedStmt.nonleafResultRelations (which happens in
set_plan_refs) and that's where the executor finds them to lock them
(which happens in InitPlan).
So, it appears that ModifyTable.partitioned_rels is indeed unused in the
executor. But we still can't get rid of it from the ModifyTable node
itself without figuring out a way (a channel) to transfer that information
into PlannedStmt.nonleafResultRelations.
> Having asked that, I think my patch shouldn't deal with removing
> partitioned_rels lists and related structures and code. It should be> done as a separate patch.
Going back to your original email which started this discussion, it seems
that we agree on that the PartitionedChildRelInfo node can be removed, and
I agree that it shouldn't be done in the partitionwise-join patch series
but as a separate patch. As described above, we shouldn't try yet to get
rid of the partitioned_rels list that appears in some plan nodes.
Thanks,
Amit
[1]
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=d3cc37f1d8
From | Date | Subject | |
---|---|---|---|
Next Message | Konstantin Knizhnik | 2017-09-05 08:10:38 | Re: Secondary index access optimizations |
Previous Message | Surafel Temesgen | 2017-09-05 07:45:55 | Re: Support to COMMENT ON DATABASE CURRENT_DATABASE |