Re: why partition pruning doesn't work?

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(at)enterprisedb(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: why partition pruning doesn't work?
Date: 2018-06-10 22:49:41
Message-ID: 26279.1528670981@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
> I've made a pass over the execPartition.c and partprune.c code
> attempting to resolve these issues. I have hopefully fixed them all,
> but I apologise if I've missed any.
> I also couldn't resist making a few other improvements to the code.

By the time this arrived, I'd already whacked around your v4 patch
quite a bit, so rather than start over I just kept going with what
I had, and then tried to merge the useful bits of this one after
the fact. I intentionally left out a couple of changes I couldn't
get excited about (such as having find_subplans_for_params_recurse
return a count), but I think I got everything in v5 otherwise.

I'm still fairly unhappy about the state of the comments, though.
It's very unclear for example what the subplan_map and subpart_map
arrays really are, eg what are they indexed by? I get the impression
that only one of them can have a non-minus-1 value for a given index,
but that's nowhere explained. Also, we have

* partprunedata Array of PartitionPruningData for the plan's target
* partitioned relation. First element contains the
* details for the target partitioned table.

And? What are the other elements, what's the index rule, is there a
specific ordering for the other elements? For that matter, "target
partitioned table" is content-free. Do you mean topmost partitioned
table? I suspect we expect the hierarchy to be flattened such that
ancestors appear before children, but that's not stated --- and if it
were, this bit about the first element would be a consequence of that.

Code-wise, there are some loose ends to be looked at.

* Related to the above, doesn't the loop at execPartition.c:1658 need
to work back-to-front? Seems like it's trying to propagate info upwards
in the hierarchy; looking at a subpartition's present_parts value when
you still might change it later doesn't look right at all.

* partkey_datum_from_expr and its caller seem pretty brain-dead with
respect to nulls. It's not even considering the possibility that a
Const has constisnull = true. Now perhaps such a case can't reach
here because of plan-time constant-folding, but I don't think this code
has any business assuming that. It's also being very stupid about
null values from expressions, just throwing up its hands and supposing
that nothing can be proven. In reality, doesn't a null guarantee we
can eliminate all partitions, since the comparison functions are
presumed strict?

* I'm fairly suspicious of the fmgr_info and fmgr_info_copy calls in
perform_pruning_base_step. Those seem likely to leak memory, and
for sure they destroy any opportunity for the called comparison
function to cache info in fn_extra --- something that's critical
for performance for some comparison functions such as array_eq.
Why is it necessary to suppose that those functions could change
from one execution to the next?

* The business with ExecFindInitialMatchingSubPlans remapping the
subplan indexes seems very dubious to me. Surely, that is adding
way more complexity and fragility than it's worth.

regards, tom lane

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-06-11 00:19:46 Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian
Previous Message Devrim Gündüz 2018-06-10 20:55:09 Re: Failed rpm package signature checks with reposync