Re: speeding up planning with partitions

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
Cc: Imai Yoshikazu <yoshikazu_i443(at)live(dot)jp>, "jesper(dot)pedersen(at)redhat(dot)com" <jesper(dot)pedersen(at)redhat(dot)com>, "Imai, Yoshikazu" <imai(dot)yoshikazu(at)jp(dot)fujitsu(dot)com>, "'Amit Langote'" <amitlangote09(at)gmail(dot)com>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: speeding up planning with partitions
Date: 2019-03-22 21:07:02
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers

Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> writes:
> [ v34 patch set ]

I had a bit of a look through this. I went ahead and pushed 0008 and
0009, as they seem straightforward and independent of the rest. (BTW,
0009 makes 0003's dubious optimization in set_relation_partition_info
quite unnecessary.) As for the rest:

0001: OK in principle, but I wonder why you implemented it by adding
another recursive scan of the jointree rather than just iterating
over the baserel array, as in make_one_rel() for instance. Seems
like this way is more work, and more code that we'll have to touch
if we ever change the jointree representation.

I also feel like you used a dartboard while deciding where to insert the
call in query_planner(); dropping it into the middle of a sequence of
equivalence-class-related operations seems quite random. Maybe we could
delay that step all the way to just before make_one_rel, since the other
stuff in between seems to only care about baserels? For example,
it'd certainly be better if we could run remove_useless_joins before
otherrel expansion, so that we needn't do otherrel expansion at all
on a table that gets thrown away as being a useless join.

BTW, it strikes me that we could take advantage of the fact that
baserels must all appear before otherrels in the rel array, by
having loops over that array stop early if they're only interested
in baserels. We could either save the index of the last baserel,
or just extend the loop logic to fall out upon hitting an otherrel.
Seems like this'd save some cycles when there are lots of partitions,
although perhaps loops like that would be fast enough to not matter.

0002: I seriously dislike this from a system-structure viewpoint.
For one thing it makes root->processed_tlist rather moot, since it
doesn't get set till after most of the work is done. (I don't know
if there are any FDWs out there that look at processed_tlist, but
they'd be broken by this if so. It'd be better to get rid of the
field if we can, so that at least such breakage is obvious. Or,
maybe, use root->processed_tlist as the communication field rather
than having the tlist be a param to query_planner?) I also
don't like the undocumented way that you've got build_base_rel_tlists
working on something that's not the final tlist, and then going back
and doing more work of the same sort later. I wonder whether we can
postpone build_base_rel_tlists until after the other stuff happens,
so that it can continue to do all of the tlist-distribution work.

0003: I haven't studied this in great detail, but it seems like there's
some pretty random things in it, like this addition in

+ /* grouping_planner doesn't need to see the target children. */
+ subroot->append_rel_list = list_copy(orig_append_rel_list);

That sure seems like a hack unrelated to the purpose of the patch ... and
since list_copy is a shallow copy, I'm not even sure it's safe. Won't
we be mutating the contained (and shared) AppendRelInfos as we go along?

0004 and 0005: I'm not exactly thrilled about loading more layers of
overcomplication into inheritance_planner, especially not when the
complication then extends out into new global planner flags with
questionable semantics. We should be striving to get rid of that
function, as previously discussed, and I fear this is just throwing
more roadblocks in the way of doing so. Can we skip these and come
back to the question after we replace inheritance_planner with

0006: Seems mostly OK, but I'm not very happy about the additional
table_open calls it's throwing into random places in the planner.
Those can be rather expensive. Why aren't we collecting the appropriate
info during the initial plancat.c consultation of the relcache?

0007: Not really sold on this; it seems like it's going to be a net
negative for cases where there aren't a lot of partitions.

regards, tom lane

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-03-22 21:09:23 Re: rename labels in heapam.c?
Previous Message Andres Freund 2019-03-22 20:58:42 rename labels in heapam.c?