Re: speeding up planning with partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>
Cc: "'David Rowley'" <david(dot)rowley(at)2ndquadrant(dot)com>, "'Alvaro Herrera'" <alvherre(at)2ndquadrant(dot)com>, "'Pg Hackers'" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions
Date: 2018-12-26 04:28:37
Message-ID: 580ddb99-cd31-096d-d485-fc309d274498@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Thank you Imai-san.

On 2018/12/25 16:47, Imai, Yoshikazu wrote:
> Here's the continuation of the review. Almost all of below comments are
> little fixes.
>
> ---
> 0001: line 76-77
> In commit message:
> exclusion for target child relation, which is no longer
> is no longer needed. Constraint exclusion runs during query_planner
>
> s/which is no longer is no longer needed/which is no longer needed/
>
> ---
> 0001: line 464
> + if (IS_DUMMY_REL(find_base_rel(root, resultRelation )))
>
> s/resultRelation )))/resultRelation)))/
> (There is an extra space.)
>
> ---
> 0001: line 395-398
> + * Reset inh_target_child_roots to not be same as parent root's so that
> + * the subroots for this child's own children (if any) don't end up in
> + * root parent's list. We'll eventually merge all entries into one list,
> + * but that's now now.
>
> s/that's now now/that's not now/
>
> ---
> 0001: line 794
> + * are put into a list that will be controlled by a single ModifyTable
>
> s/are put into a list/are put into a list/
> (There are two spaces between "into" and "a".)

All fixed in my local repository.

> ---
> 0001: line 241-242, 253-254, 291-294 (In set_append_rel_size())
>
> + if (appinfo->parent_relid == root->parse->resultRelation)
> + subroot = adjust_inherit_target_child(root, childrel, appinfo);
>
> + add_child_rel_equivalences(subroot, appinfo, rel, childrel,
> + root != subroot);
>
> + if (subroot != root)
> + {
> + root->inh_target_child_roots =
> + lappend(root->inh_target_child_roots, subroot);
>
> A boolean value of "appinfo->parent_relid == root->parse->resultRelation" is
> same with "subroot != root"(because of line 241-242), so we can use bool
> variable here like
> child_is_target = (appinfo->parent_relid == root->parse->resultRelation).
> The name of child_is_target is brought from arguments of
> add_child_rel_equivalences() in your patch.
>
> I attached a little diff as "v9-0001-delta.diff".

Sounds like a good idea for clarifying the code, so done.

> ---
> 0001: line 313-431
>
> adjust_inherit_target_child() is in allpaths.c in your patch, but it has the
> code like below ones which are in master's(not patch applied) planner.c or
> planmain.c, so could it be possible in planner.c(or planmain.c)?
> This point is less important, but I was just wondering whether
> adjust_inherit_target_child() should be in allpaths.c, planner.c or planmain.c.
>
> + /* Translate the original query's expressions to this child. */
> + subroot = makeNode(PlannerInfo);
> + memcpy(subroot, root, sizeof(PlannerInfo));
>
> + root->parse->targetList = root->unexpanded_tlist;
> + subroot->parse = (Query *) adjust_appendrel_attrs(root,
> + (Node *) root->parse,
> + 1, &appinfo);
>
> + tlist = preprocess_targetlist(subroot);
> + subroot->processed_tlist = tlist;
> + build_base_rel_tlists(subroot, tlist);

I'm thinking of changing where adjust_inherit_target_child gets called
from. In the current patch, it's in the middle of set_rel_size which I'm
starting to think is a not a good place for it. Maybe, I'll place the
call call near to where inheritance is expanded.

> ---
> 0001: line 57-70
>
> In commit message:
> This removes some existing code in inheritance_planner that dealt
> with any subquery RTEs in the query. The rationale of that code
> was that the subquery RTEs may change during each iteration of
> planning (that is, for different children), so different iterations
> better use different copies of those RTEs.
> ...
> Since with the new code
> we perform planning just once, I think we don't need this special
> handling.
>
> 0001: line 772-782
> - * controlled by a single ModifyTable node. All the instances share the
> - * same rangetable, but each instance must have its own set of subquery
> - * RTEs within the finished rangetable because (1) they are likely to get
> - * scribbled on during planning, and (2) it's not inconceivable that
> - * subqueries could get planned differently in different cases. We need
> - * not create duplicate copies of other RTE kinds, in particular not the
> - * target relations, because they don't have either of those issues. Not
> - * having to duplicate the target relations is important because doing so
> - * (1) would result in a rangetable of length O(N^2) for N targets, with
> - * at least O(N^3) work expended here; and (2) would greatly complicate
> - * management of the rowMarks list.
>
> I used considerable time to confirm there are no needs copying subquery RTEs in
> the new code, but so far I couldn't. If copied RTEs are only used when planning,
> it might not need to copy RTEs in the new code because we perform planning just
> once, so I tried to detect when copied RTEs are used or overwritten, but I gave
> up.
>
> Of course, there are comments about this,
>
> - * same rangetable, but each instance must have its own set of subquery
> - * RTEs within the finished rangetable because (1) they are likely to get
> - * scribbled on during planning, and (2) it's not inconceivable that
>
> so copied RTEs might be used when planning, but I couldn't find the actual codes.
> I also checked commits[1, 2] related to these codes. I'll check these for more
> time but it would be better there are other's review and I also want a help here.
>
> [1]
https://github.com/postgres/postgres/commit/b3aaf9081a1a95c245fd605dcf02c91b3a5c3a29
> [2]
https://github.com/postgres/postgres/commit/c03ad5602f529787968fa3201b35c119bbc6d782

Thank you very much for spending time on that. I'd really like someone
else to consider this as well.

Here's the summary of what I'm proposing to change with respect to the
above: because we perform scan-level planning only once for any given
relation including for sub-queries with the patch applied, we no longer
need to make copies of sub-query RTEs that are currently needed due to
repeated planning, with one copy per child target relation. Since there
are no new copies, there is no need to process various nodes to change the
RT index being used to refer to the sub-query. I have removed the code
that does to copying of subquery RTEs and the code that does translation
due new RT index being assigned every time a new copy was being made.

> ---
> Maybe I checked all the way of the v9 patch excluding the codes about
> EquivalenceClass codes(0001: line 567-638).
> I'll consider whether there are any performance degration case, but I have
> no idea for now. Do you have any points you concerns? If there, I'll check it.

I will send an updated patch hopefully before my new year vacation that
starts on Friday this week.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2018-12-26 04:46:48 Re: Alter table documentation page (again)
Previous Message Nagaura, Ryohei 2018-12-26 04:09:40 RE: Timeout parameters