Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.

From: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
To: Etsuro Fujita <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Expression errors with "FOR UPDATE" and postgres_fdw with partition wise join enabled.
Date: 2018-07-06 11:20:39
Message-ID: CAFjFpRf2vd0C6B8g1TmpJN8RkPECgZy8iGe9UYZTy9EWu1eL2Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Jul 6, 2018 at 4:29 PM, Etsuro Fujita
<fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
> (2018/07/04 21:37), Ashutosh Bapat wrote:
>>
>> On Wed, Jul 4, 2018 at 5:36 PM, Etsuro Fujita
>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>
>>> (2018/07/04 19:04), Ashutosh Bapat wrote:
>>>>
>>>> On Fri, Jun 29, 2018 at 6:21 PM, Etsuro Fujita
>>>> <fujita(dot)etsuro(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>>>>>
>>>>> (2018/06/22 22:54), Ashutosh Bapat wrote:
>>>>>>
>>>>>> + if (enable_partitionwise_join&&
>>>>>> rel->top_parent_is_partitioned)
>>>>>> + {
>>>>>> + build_childrel_tlist(root, rel, childrel, 1,&appinfo);
>>>>>> + }
>>>>>>
>>>>>> Why do we need rel->top_parent_is_partitioned? If a relation is
>>>>>> partitioned (if (rel->part_scheme), it's either the top parent or is
>>>>>> partition of some other partitioned table. In either case this
>>>>>> condition will be true.
>
>
>>>>> This would be needed to avoid unnecessarily applying
>>>>> build_childrel_tlist
>>>>> to
>>>>> child rels of a partitioned table for which we don't consider
>>>>> partitionwise
>>>>> join. Consider:
>>>>>
>>>>> postgres=# create table lpt (c1 int, c2 text) partition by list (c1);
>>>>> CREATE TABLE
>>>>> postgres=# create table lpt_p1 partition of lpt for values in (1);
>>>>> CREATE TABLE
>>>>> postgres=# create table lpt_p2 (c1 int check (c1 in (2)), c2 text);
>>>>> CREATE TABLE
>>>>> postgres=# create table test (c1 int, c2 text);
>>>>> CREATE TABLE
>>>>> postgres=# explain verbose select * from (select * from lpt union all
>>>>> select
>>>>> * from lpt_p2) ss(c1, c2) inner join test on (ss.c1 = test.c1);
>
>
>>> I might misunderstand your words, but in the above example the patch
>>> doesn't
>>> apply build_childrel_tlist to lpt_p1 and lpt_p2. The reason for that is
>>> because we can avoid adjusting the tlists for the corresponding subplans
>>> at
>>> plan creation time so that whole-row Vars in the tlists are transformed
>>> into
>>> CREs. I think the overhead of the adjustment is not that big, but not
>>> zero,
>>> so it would be worth avoiding applying build_childrel_tlist to partitions
>>> if
>>> the top parent won't participate in a partitionwise-join at all.
>>
>>
>> I don't think that answers my question. When we join lpt with test,
>> your patch will apply build_childrel_tlist() to lpt_p1 and lpt_p2 even
>> when join between lpt and test is not going to use partition-wise
>> join. Why?
>
>
> Maybe my explanation including the example was not good. Sorry about that,
> but my patch will *not* apply build_childrel_tlist to lpt_p1 and lpt_p2
> since the top parent of lpt_p1 and lpt_p2 is the UNION ALL subquery and
> hence not a partitioned table (ie, we have
> rel->top_parent_is_partitioned=false for lpt_p1 and lpt_p2).
>
>> As per your explanation, the condition "if
>> (enable_partitionwise_join&& rel->top_parent_is_partitioned)" is
>> used to avoid applying build_childrel_tlist() when partition-wise join
>> won't be possible. But it's not covering all the cases.
>
>
> Let me explain about that: 1) my patch won't apply that function to a child
> if its top parent is an appendrel built from a UNION ALL subquery, even
> though the child is a partition of a partitioned table pulled up from a leaf
> subquery into the parent query, like lpt_p1, and 2) my patch will apply that
> function to a child if its top parent is a partitioned table, whether or not
> the partitioned table is involved in a partitionwise join. By #1, we avoid
> the adjustment work at plan creation time, as explained above. It might be
> worth trying to be smarter about #2 (for example, in the case of a join of a
> partitioned table and a non-partitioned table, since we don't consider a
> partitionwise join for that join, it's better to not apply that function to
> partitions of the partitioned table, to avoid the adjustment work at plan
> creation time), but ISTM we don't have enough information to be smarter.

That looks like a kludge to me rather than a proper fix. It's not
clear to me as to when somebody can expect ConvertRowtypeExpr in the
targetlist and when don't while creating paths and to an extent plans.
For example, inside add_paths_to_append_rel() or in
apply_scanjoin_target_to_paths() or for that matter any path creation
or plan creation function, we will sometimes get targetlists with
ConvertRowtypeExpr() and sometime not. How do we know which is when.

>
>>>> A ConvertRowtypeExpr
>>>> is used to fix this inconsistency. That's why I chose to use
>>>> pull_var_clause() as a place to fix the problem and not fix
>>>> ConvertRowtypeExpr in targetlist itself.
>>>
>>>
>>>
>>> I think the biggest issue in the original patch you proposed is that we
>>> spend extra cycles where partitioning is not involved, which is the
>>> biggest
>>> reason why I think the original patch isn't the right way to go.
>>
>>
>> When there are no partitions involved, there won't be any
>> ConvertRowtypeExprs there which means the function
>> is_converted_whole_row_reference() would just return from the first
>> line checking IsA() and nullness of node.
>
>
> Really? IIRC, with the original patch we would spend extra cycles in a
> non-partitioned inheritance processing [1].

As I said, we do spend cycles in that function testing whether a node
is Aggref or not even when the query doesn't have aggregates or
grouping OR spend cycles in testing whether a node is a PlaceHolderVar
when the query doesn't produce any. So, I don't see any problem with
spending a few cycles testing whether a node is ConvertRowtypeExpr or
not when a ConvertRowtypeExpr is not in the query or command. That's
not a huge performance trouble. I would be happy to change my mind, if
you show me performance different with and without this patch in
planning. I haven't seen any.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2018-07-06 11:36:58 Re: log_min_messages shows debug instead of debug2
Previous Message Michael Paquier 2018-07-06 11:18:59 Re: [HACKERS] Crash on promotion when recovery.conf is renamed