Re: Partition-wise join for join between (declaratively) partitioned tables

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-09-12 08:53:31
Message-ID: CAFjFpRfWu3pVXv5DU0VgSh9JMwwzzWbzxXiPnpxVwVqRn82wyg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Sep 12, 2017 at 1:42 PM, Amit Langote
<Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> On 2017/09/12 16:55, Ashutosh Bapat wrote:
>> On Tue, Sep 12, 2017 at 1:16 PM, Amit Langote wrote:
>>> So I looked at this a bit closely and came to the conclusion that we may
>>> not need to keep partitioned table RT indexes in the
>>> (Merge)Append.partitioned_rels after all, as far as execution-time locking
>>> is concerned.
>>>
>>> Consider two cases:
>>>
>>> 1. Plan is created and executed in the same transaction
>>>
>>> In this case, locks taken on the partitioned tables by the planner will
>>> suffice.
>>>
>>> 2. Plan is executed in a different transaction from the one in which it
>>> was created (a cached plan)
>>>
>>> In this case, AcquireExecutorLocks will lock all the relations in
>>> PlannedStmt.rtable, which must include all partitioned tables of all
>>> partition trees involved in the query. Of those, it will lock the tables
>>> whose RT indexes appear in PlannedStmt.nonleafResultRelations with
>>> RowExclusiveLock mode. PlannedStmt.nonleafResultRelations is a global
>>> list of all partitioned table RT indexes obtained by concatenating
>>> partitioned_rels lists of all ModifyTable nodes involved in the query
>>> (set_plan_refs does that). We need to distinguish nonleafResultRelations,
>>> because we need to take the stronger lock on a given table before any
>>> weaker one if it happens to appear in the query as a non-result relation
>>> too, to avoid lock strength upgrade deadlock hazard.
>>>
>>> Moreover, because all the tables from plannedstmt->rtable, including the
>>> partitioned tables, will be added to PlannedStmt.relationsOids, any
>>> invalidation events affecting the partitioned tables (for example,
>>> add/remove a partition) will cause the plan involving partitioned tables
>>> to be recreated.
>>>
>>> In none of this do we rely on the partitioned table RT indexes appearing
>>> in the (Merge)Append node itself. Maybe, we should just remove
>>> partitioned_rels from (Merge)AppendPath and (Merge)Append node in a
>>> separate patch and move on.
>>>
>>> Thoughts?
>>
>> Yes, I did the same analysis (to which I refer in my earlier reply to
>> you). I too think we should just remove partitioned_rels from Append
>> paths. But then the question is those are then transferred to
>> ModifyTable node in create_modifytable_plan() and use it for something
>> else. What should we do about that code? I don't think we are really
>> using that list from ModifyTable node as well, so may be we could
>> remove it from there as well. What do you think? Does that mean
>> partitioned_rels isn't used at all in the code?
>
> No, we cannot simply get rid of partitioned_rels altogether. We'll need
> to keep it in the ModifyTable node, because we *do* need the
> nonleafResultRelations list in PlannedStmt to distinguish partitioned
> table result relations, which set_plan_refs builds by concatenating
> partitioned_rels lists of various ModifyTable nodes of the query. The
> PlannedStmt.nonleafResultRelations list actually has some use (which
> parallels PlannedStmt.resultRelations), but partitioned_rels list in the
> individual (Merge)Append, as it turns out, doesn't.
>
> So, we can remove partitioned_rels from (Merge)AppendPath and
> (Merge)Append nodes and remove ExecLockNonLeafAppendTables().

Don't we need partitioned_rels from Append paths to be transferred to
ModifyTable node or we have a different way of calculating
nonleafResultRelations?

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2017-09-12 08:57:32 Re: no test coverage for ALTER FOREIGN DATA WRAPPER name HANDLER ...
Previous Message Amit Langote 2017-09-12 08:47:04 Re: Partition-wise join for join between (declaratively) partitioned tables