Re: speeding up planning with partitions

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions
Date: 2018-09-14 10:28:32
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Thanks a lot for your detailed review.

On 2018/09/11 8:23, David Rowley wrote:
> On 30 August 2018 at 21:29, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> Attached updated patches, with 0002 containing the changes mentioned above.
> Here's my first pass review on this:

I've gone through your comments on 0001 so far, but didn't go through
others yet, to which I'll reply separately.

> 0001:
> 1. I think the following paragraph should probably mention something
> about some performance difference between the two methods:
> <para>
> Constraint exclusion works in a very similar way to partition
> pruning, except that it uses each table's <literal>CHECK</literal>
> constraints &mdash; which gives it its name &mdash; whereas partition
> pruning uses the table's partition bounds, which exist only in the
> case of declarative partitioning. Another difference is that
> constraint exclusion is only applied at plan time; there is no attempt
> to remove partitions at execution time.
> </para>
> Perhaps tagging on. "Constraint exclusion is also a much less
> efficient way of eliminating unneeded partitions as metadata for each
> partition must be loaded in the planner before constraint exclusion
> can be applied. This is not a requirement for partition pruning."

Hmm, isn't that implied by the existing text? It already says that
constraint exclusion uses *each* table's/partition's CHECK constraints,
which should make it clear that for a whole lot of partitions, that will
be a slower than partition pruning which requires accessing only one
table's metadata.

If we will have dissociated constraint exclusion completely from
partitioned tables with these patches, I'm not sure if we have to stress
that it is inefficient for large number of tables.

> 2. I think "rootrel" should be named "targetpart" in:
> + RelOptInfo *rootrel = root->simple_rel_array[root->parse->resultRelation];

To me, "targetpart" makes a partitioned table sound like a partition,
which it is not. I get that using "root" can be ambiguous, because a
query can specify a non-root partitioned table.

How about "targetrel"?

> 3. Why does the following need to list_copy()?
> + List *saved_join_info_list = list_copy(root->join_info_list);

In earlier versions of this code, root->join_info_list would be modified
during make_one_rel_from_joinlist, but that no longer seems true.

Removed the list_copy.

> 4. Is the "root->parse->commandType != CMD_INSERT" required in:
> @@ -181,13 +185,30 @@ make_one_rel(PlannerInfo *root, List *joinlist)
> /*
> * Generate access paths for the entire join tree.
> + *
> + * If we're doing this for an UPDATE or DELETE query whose target is a
> + * partitioned table, we must do the join planning against each of its
> + * leaf partitions instead.
> */
> - rel = make_rel_from_joinlist(root, joinlist);
> + if (root->parse->resultRelation &&
> + root->parse->commandType != CMD_INSERT &&
> + root->simple_rel_array[root->parse->resultRelation] &&
> + root->simple_rel_array[root->parse->resultRelation]->part_scheme)
> + {
> Won't the simple_rel_array entry for the resultRelation always be NULL
> for an INSERT?

Yep, you're right. Removed.

> 5. In regards to:
> + /*
> + * Hack to make the join planning code believe that 'partrel' can
> + * be joined against.
> + */
> + partrel->reloptkind = RELOPT_BASEREL;
> Have you thought about other implications of join planning for other
> member rels, for example, equivalence classes and em_is_child?

Haven't really, but that seems like an important point. I will study it
more closely.

> 6. It would be good to not have to rt_fetch the same RangeTblEntry twice
> @@ -959,7 +969,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
> * needs special processing, else go straight to grouping_planner.
> */
> if (parse->resultRelation &&
> - rt_fetch(parse->resultRelation, parse->rtable)->inh)
> + rt_fetch(parse->resultRelation, parse->rtable)->inh &&
> + rt_fetch(parse->resultRelation, parse->rtable)->relkind !=
> inheritance_planner(root);

The new version doesn't call inheritance_planner at all; there is no
inheritance_planner in the new version.

> 7. Why don't you just pass the Parse into the function as a parameter
> instead of overwriting PlannerInfo's copy in:
> + root->parse = partition_parse;
> + partitionwise_adjust_scanjoin_target(root, child_rel,
> + subroots,
> + partitioned_rels,
> + resultRelations,
> + subpaths,
> + WCOLists,
> + returningLists,
> + rowMarks);
> + /* Restore the Query for processing the next partition. */
> + root->parse = parse;

Okay, done.

> 8. Can you update the following comment to mention why you're not
> calling add_paths_to_append_rel for this case?
> @@ -6964,7 +7164,9 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
> }
> /* Build new paths for this relation by appending child paths. */
> - if (live_children != NIL)
> + if (live_children != NIL &&
> + !(rel->reloptkind == RELOPT_BASEREL &&
> + rel->relid == root->parse->resultRelation))
> add_paths_to_append_rel(root, rel, live_children);

Oops, stray code from a previous revision. Removed this hunk.

> 9. The following should use >= 0, not > 0
> + while ((relid = bms_next_member(child_rel->relids, relid)) > 0)

Yeah, fixed.

Sorry, I haven't yet worked on your comments on 0002 and 0003. For time
being, I'd like to report what I've been up to these past couple of days,
because starting tomorrow until the end of this month, I won't be able to
reply to emails on -hackers due to personal vacation.

So, I spent a better part of last few days on trying to revise the patches
so that it changes the planning code for *all* inheritance cases instead
of just focusing on partitioning. Because, really, the only difference
between the partitioning code and regular inheritance code should be that
partitioning is able to use faster pruning, and all the other parts should
look and work more or less the same. There shouldn't be code here that
deals with partitioning and code there to deal with inheritance.
Minimizing code this way should be a good end to aim for, imho.

Attached is what I have at the moment. As part of the effort mentioned
above, I made 0001 to remove inheritance_planner altogether, instead of
just taking out the partitioning case out of it. Then there is the WIP
patch 0004 which tries to move even regular inheritance expansion to late
planning stage, just like the earlier versions did for partitioning. It
will need quite a bit of polishing before we could consider it for merging
with 0002. Of course, I'll need to address your comments before
considering doing that any seriously.


Attachment Content-Type Size
v3-0001-Overhaul-inheritance-update-delete-planning.patch text/plain 59.9 KB
v3-0002-Lazy-creation-of-partition-objects-for-planning.patch text/plain 49.1 KB
v3-0003-Only-lock-partitions-that-will-be-scanned-by-a-qu.patch text/plain 10.5 KB
v3-0004-WIP-generate-all-inheritance-child-RelOptInfos-af.patch text/plain 63.7 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2018-09-14 11:00:37 Re: Problem while setting the fpw with SIGHUP
Previous Message Masahiko Sawada 2018-09-14 10:23:13 Re: Connection slots reserved for replication