Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Phil Florent <philflorent(at)hotmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian
Date: 2018-06-15 08:37:49
Message-ID: 2776b664-b175-5ebc-c8c9-d234a2f733b4@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/06/11 16:49, David Rowley wrote:
> On 11 June 2018 at 12:19, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>>> On 10 June 2018 at 04:48, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>>>> So, IIUC, the issue is that for partitioning cases Append expects *all*
>>>> its children to be partitions of the *same* partitioned table? That
>>>> is, you could also break it with
>>>>
>>>> select * from partitioned_table_a
>>>> union all
>>>> select * from partitioned_table_b
>>>>
>>>> ?
>>
>>> Not quite.

That would be correct I think. An Append may contain multiple partitioned
tables that all appear under an UNION ALL parent, as in the OP's case and
the example above. In this case, the partitioned_rels list of Append
consist of non-leaf tables from *all* of the partitioned tables. Before
run-time pruning arrived, the only purpose of partitioned_rels list was to
make sure that the executor goes through it and locks all of those
non-leaf tables (ExecLockNonLeafAppendTables). Run-time pruning expanded
its usage by depending it to generate run-time pruning info.

>> I just had a thought that might lead to a nice solution to that, or
>> might be totally crazy. What if we inverted the sense of the bitmaps
>> that track partition pruning state, so that instead of a bitmap of
>> valid partitions that need to be scanned, we had a bitmap of pruned
>> partitions that we know we don't need to scan? (The indexes of this
>> bitmap would be subplan indexes not partition indexes.) With this
>> representation, it doesn't matter if some of the Append's children
>> are not supposed to participate in pruning; they just don't ever get
>> added to the bitmap of what to skip. It's also fairly clear, I think,
>> how to handle independent pruning rules for different top-level tables
>> that are being unioned together: just OR the what-to-skip bitmaps.
>> But there may be some reason why this isn't workable.
>
> I think it would be less efficient. A common case and one that I very
> much would like to make as fast as possible is when all but a single
> partition is pruned. Doing the opposite sounds like more effort would
> need to be expended to get the subplans that we do need to scan.
>
> I don't really see the way it works now as a huge problem to overcome
> in pruning. We'd just a list of subplans that don't belong to the
> hierarchy and tag them on to the matches found in
> ExecFindInitialMatchingSubPlans and ExecFindMatchingSubPlans. The
> bigger issue to overcome is the mixed flattened list of partition RT
> indexes in partitioned_rels. Perhaps having a list of Lists for
> partitioned_rels could be used to resolve that. The question is more,
> how to solve for PG11. Do we need that?
>
> I think we'll very soon be wanting to have ordered partition scans
> where something like:
>
> create table listp(a int) partition by list(a);
> create index on listp(a);
> create table listp1 partition of listp for values in (1);
> create table listp2 partition of listp for values in (2);
>
> and
>
> select * from listp order by a;
>
> would be possible with an Append and Index Scan, rather than having a
> MergeAppend or Sort. In which case we'll not want mixed partition
> hierarchies in the Append subplans. Although, perhaps that would mean
> we just wouldn't pullup AppendPaths which have PathKeys.
>
> I have written and attached the patch to stop flattening of
> partitioned tables into UNION ALL parent's paths, meaning we can now
> get nested Append and MergeAppend paths.
>
> I've added Robert too as I know he was the committer of partitioning
> and parallel Append. Maybe he has a view on what should be done about
> this? Is not flattening the paths a problem?

Not speaking for Robert here, just saying from what I know.

I don't think your patch breaks anything, even if does change the shape of
the plan. So, for:

select * from partitioned_table_a
union all
select * from partitioned_table_b

The only thing that changes with the patch is that
ExecLockNonLeafAppendTables is called *twice* for the two nested Appends
corresponding to partitioned_table_a and partitioned_table_b, resp.,
instead of just once for the top level Append corresponding to the UNION
ALL parent. In fact, when called for the top level Append,
ExecLockNonLeafAppendTables is now a no-op.

Thanks,
Amit

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-06-15 08:39:02 Re: why partition pruning doesn't work?
Previous Message Masahiko Sawada 2018-06-15 08:37:41 Re: Index maintenance function for BRIN doesn't check RecoveryInProgress()