Re: Partition-wise join for join between (declaratively) partitioned tables

From: Rafia Sabih <rafia(dot)sabih(at)enterprisedb(dot)com>
To: Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Rajkumar Raghuwanshi <rajkumar(dot)raghuwanshi(at)enterprisedb(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Partition-wise join for join between (declaratively) partitioned tables
Date: 2017-03-22 08:17:56
Message-ID: CAOGQiiOxb8gMO_wDE0=qPAVHpajtvZOxeBhkM4Gf_VD2VpSy0A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Mar 21, 2017 at 10:40 PM, Ashutosh Bapat
<ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
> Here's the set of patches rebased on latest head, which also has a
> commit to eliminate scans on partitioned tables. This change has
> caused problems with multi-level partitioned tables, that I have not
> fixed in this patch set. Also a couple of partition-wise join plans
> for single-level partitioned tables have changed to non-partition-wise
> joins. I haven't fixed those as well.
>
> I have added a separate patch to fix add_paths_to_append_rel() to
> collect partitioned_rels list for join relations. Please let me know
> if this looks good. I think it needs to be merged into some other
> patch, but I am not sure which. Probably we should just treat it as
> another refactoring patch.
>
> On Tue, Mar 21, 2017 at 5:16 PM, Ashutosh Bapat
> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>> On Mon, Mar 20, 2017 at 11:33 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>> On Mon, Mar 20, 2017 at 1:19 PM, Ashutosh Bapat
>>> <ashutosh(dot)bapat(at)enterprisedb(dot)com> wrote:
>>>>> That seems different than what I suggested and I'm not sure what the
>>>>> reason is for the difference?
>>>>
>>>> The patch adding macros IS_JOIN_REL() and IS_OTHER_REL() and changing
>>>> the code to use it will look quite odd by itself. We are not changing
>>>> all the instances of RELOPT_JOINREL or RELOPT_OTHER_MEMBER_REL to use
>>>> those. There is code which needs to check those kinds, instead of "all
>>>> join rels" or "all other rels" resp. So the patch will add those
>>>> macros, change only few places to use those macros, which are intended
>>>> to be changed while applying partition-wise join support for single
>>>> level partitioned table.
>>>
>>> Hmm. You might be right, but I'm not convinced.
>>
>> Ok. changed as per your request in the latest set of patches.
>>
>> There are some more changes as follows
>> 1. In the earlier patch set the changes related to
>> calc_nestloop_required_outer() and related functions were spread
>> across multiple patches. That was unintentional. This patch set has
>> all those changes in a single patch.
>>
>> 2. Rajkumar reported a crash offlist. When one of the joining
>> multi-level partitioned relations is empty, an assertion in
>> try_partition_wise_join() Assert(rel1->part_rels && rel2->part_rels);
>> failed since it didn't find part_rels for a subpartition. The problem
>> here is set_append_rel_size() does not call set_rel_size() and hence
>> set_append_rel_size() if a child is found to be empty, a scenario
>> described in [1]. It's the later one which sets the part_rels for a
>> partitioned relation and hence the subpartitions do not get part_rels
>> since set_append_rel_size() is never called for those. Generally, if a
>> partitioned relation is found to be empty before we set part_rels, we
>> may not want to spend time in creating/collecting child RelOptInfos,
>> since they will be empty anyway. If part_rels isn't present,
>> part_scheme doesn't make sense. So an empty partitioned table without
>> any partitions can be treated as unpartitioned. So, I have fixed
>> set_dummy_rel_pathlist() and mark_dummy_rel(), the functions setting a
>> relation empty, to reset partition scheme when those conditions are
>> met. This fix is included as a separate patch. Let me know if this
>> looks good to you.
>>
>> 3. I am in the process of completing reparameterize_paths_by_child()
>> by adding all possible paths. I have restructured the function to look
>> better and have one switch case instead of two. Also added more path
>> types including ForeignPath, for which I have added a FDW hook, with
>> documentation, for handling fdw_private. Please let me know if this
>> looks good to you. I am thinking of similar hook for CustomPath. I
>> will continue to add more path types to
>> reparameterize_path_by_child().
>>
>> I am wondering whether we should bring 0007 he patche adjusting code
>> to work with child-joins before 0006, partition-wise join. 0006 needs
>> it, but 0007 doesn't depend upon 0006. Will that be any better?
>>
>> [1] CAFjFpRcdrdsCRDbBu0J2pxwWbhb_sDWQUTVznBy_4XGr-p3+wA(at)mail(dot)gmail(dot)com,
>> subject "Asymmetry between parent and child wrt "false" quals"
>>
In an attempt to test the geqo side of this patch, I reduced
geqo_threshold to 6 and set enable_partitionwise_join to to true and
tried following query, which crashed,

explain select * from prt, prt2, prt3, prt32, prt4, prt42 where prt.a
= prt2.b and prt3.a = prt32.b and prt4.a = prt42.b and prt2.a > 1000
order by prt.a desc;

Stack-trace for the crash is as follows,

Program received signal SIGSEGV, Segmentation fault.
0x00000000007a43d1 in find_param_path_info (rel=0x2d3fe30,
required_outer=0x2ff6d30) at relnode.c:1534
1534 if (bms_equal(ppi->ppi_req_outer, required_outer))
(gdb) bt
#0 0x00000000007a43d1 in find_param_path_info (rel=0x2d3fe30,
required_outer=0x2ff6d30) at relnode.c:1534
#1 0x000000000079b8bb in reparameterize_path_by_child
(root=0x2df7550, path=0x2f6dec0, child_rel=0x2d4a860) at
pathnode.c:3455
#2 0x000000000075be30 in try_nestloop_path (root=0x2df7550,
joinrel=0x2ff51b0, outer_path=0x2f96540, inner_path=0x2f6dec0,
pathkeys=0x0,
jointype=JOIN_INNER, extra=0x7fffe6b4e130) at joinpath.c:344
#3 0x000000000075d55b in match_unsorted_outer (root=0x2df7550,
joinrel=0x2ff51b0, outerrel=0x2d4a860, innerrel=0x2d3fe30,
jointype=JOIN_INNER,
extra=0x7fffe6b4e130) at joinpath.c:1389
#4 0x000000000075bc5f in add_paths_to_joinrel (root=0x2df7550,
joinrel=0x2ff51b0, outerrel=0x2d4a860, innerrel=0x2d3fe30,
jointype=JOIN_INNER,
sjinfo=0x3076bc8, restrictlist=0x3077168) at joinpath.c:234
#5 0x000000000075f1d5 in populate_joinrel_with_paths (root=0x2df7550,
rel1=0x2d3fe30, rel2=0x2d4a860, joinrel=0x2ff51b0, sjinfo=0x3076bc8,
restrictlist=0x3077168) at joinrels.c:793
#6 0x0000000000760107 in try_partition_wise_join (root=0x2df7550,
rel1=0x2d3f6d8, rel2=0x2d4a1a0, joinrel=0x30752f0,
parent_sjinfo=0x7fffe6b4e2d0,
parent_restrictlist=0x3075768) at joinrels.c:1401
#7 0x000000000075f0e6 in make_join_rel (root=0x2df7550,
rel1=0x2d3f6d8, rel2=0x2d4a1a0) at joinrels.c:744
#8 0x0000000000742053 in merge_clump (root=0x2df7550,
clumps=0x3075270, new_clump=0x30752a8, force=0 '\000') at
geqo_eval.c:260
#9 0x0000000000741f1c in gimme_tree (root=0x2df7550, tour=0x2ff2430,
num_gene=6) at geqo_eval.c:199
#10 0x0000000000741df5 in geqo_eval (root=0x2df7550, tour=0x2ff2430,
num_gene=6) at geqo_eval.c:102
#11 0x000000000074288a in random_init_pool (root=0x2df7550,
pool=0x2ff23d0) at geqo_pool.c:109
#12 0x00000000007422a6 in geqo (root=0x2df7550, number_of_rels=6,
initial_rels=0x2ff22d0) at geqo_main.c:114
#13 0x0000000000747f19 in make_rel_from_joinlist (root=0x2df7550,
joinlist=0x2dce940) at allpaths.c:2333
#14 0x0000000000744e7e in make_one_rel (root=0x2df7550,
joinlist=0x2dce940) at allpaths.c:182
#15 0x0000000000772df9 in query_planner (root=0x2df7550,
tlist=0x2dec2c0, qp_callback=0x777ce1 <standard_qp_callback>,
qp_extra=0x7fffe6b4e700)
at planmain.c:254

Please let me know if any more information is required on this.
--
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Seki, Eiji 2017-03-22 08:25:16 Re: BRIN de-summarize ranges
Previous Message Amit Langote 2017-03-22 08:00:53 Re: Bug in get_partition_for_tuple