Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>, Phil Florent <philflorent(at)hotmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian
Date: 2018-07-16 00:13:42
Message-ID: CAKJS1f9012Yf_VYrgvAuW3E+ykzE_L-GMRGtAiY1xbAd861Z=g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 16 July 2018 at 06:55, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> First off, given that we're now going to have a Plan data structure
> that accurately reflects the partition hierarchy, I wonder whether we
> couldn't get rid of the fiddling around with lists of ints and lists of
> lists of ints; aren't those basically duplicative?

I'm a bit confused by this. I get that you're talking about the
partitioned_rels List, but without that list then how would
make_partition_pruneinfo() know what the subnode's parents are?
Perhaps we could add a relid field in RelOptInfo to mark the direct
parent of a but that does not make getting a unique list very quick
when given a List of subplans. A Bitmapset could be used, but we'll
end up with a mixed hierarchy with UNION ALL parents. Unsure how to
separate those again without some complex processing.

I'm not objecting to improving this as I don't really like that list,
but I just can't quite think of how else to represent the partition
hierarchy.

> They'd certainly be
> so if we add the rel's RT index to PartitionedRelPruneInfo, but maybe
> they are even without that. Alvaro complained about the code associated
> with those lists not being very idiomatic, but I'd like to just get rid
> of the lists entirely. I especially don't care for keeping the implicit
> ordering assumptions in those lists (parents before children) when the
> same info is now going to be explicit in a nearby structure. (This
> ties into the remarks I just made to Amit about getting rid of
> executor-startup lock-taking logic. Most of that change only makes
> sense to do in v12, but I'd prefer that this patch not double down on
> reliance on data structures we'd like to get rid of.)

Right, but I need to use something for v11. Do you want to see two
separate patches here? If we're not going to change this in v11 then
I still need to use something to describe the partition hierarchy so
that make_partition_pruneinfo() knows how to build the data structures
for run-time pruning.

> Second, I still feel that you've got the sense of the prunable-subplans
> bitmaps backwards, and I do not buy the micro-optimization argument you
> made for doing it like that. It complicates the code, yet the cost of
> one bit in a bitmap is completely negligible compared to all the other
> costs involved in having a per-partition subplan, whether or not that
> subplan actually gets used in a particular run.

But get_matching_partitions() returns the set of matching partitions,
not the set that does not match. It sounds like doing it the way you
ask would require inverting the Bitmapset returned by that. I don't
quite understand why you think this is more simple to implement. I
can't see how we'd map the non-matching partitions into subplan
indexes for the Append node. Can you give a bit more detail on your
design for this?

> One very minor quibble is that I'd be inclined to represent the
> PartitionPruningData struct like this:
>
> typedef struct PartitionPruningData
> {
> int num_partrelprunedata;
> PartitionedRelPruningData partrelprunedata[FLEXIBLE_ARRAY_MEMBER];
> } PartitionPruningData;
>
> so we can allocate it with one palloc not two.

That could be done, sort of. Only I currently allocate the array which
stores these PartitionPruningDatas as one chunk of memory. I can do
that today because the PartitionPruningData struct is a fixed size. If
you want to make it have a flexible size then I'd need to allocate an
array of pointers in the PartitionPruningState... or use a
FLEXIBLE_ARRAY_MEMBER of pointers there too. I've done it that way
locally for now.

> Also, in the Plan
> representation, I'd suggest avoiding situations where a data structure
> is sometimes a List of Lists of PartitionedRelPruneInfo and sometimes
> just a single-level List. Saving one ListCell isn't worth the code
> complexity and error-proneness of that.

I'll make that change. But I'm confused; was the first thing you
mentioned above not to get rid of this list?

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message David Rowley 2018-07-16 00:24:59 Re: why partition pruning doesn't work?
Previous Message Robert Haas 2018-07-16 00:12:37 Re: why partition pruning doesn't work?