Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian

From: Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Phil Florent <philflorent(at)hotmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Internal error XX000 with enable_partition_pruning=on, pg 11 beta1 on Debian
Date: 2018-07-25 08:28:11
Message-ID: 38568bd4-56fa-4173-703b-fb84c09b5fd1@lab.ntt.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2018/07/21 0:17, David Rowley wrote:
> On 20 July 2018 at 21:44, Amit Langote <Langote_Amit_f8(at)lab(dot)ntt(dot)co(dot)jp> wrote:
>> But I don't think the result of make_partition_pruneinfo itself has to be
>> List of PartitionedRelPruneInfo nested under PartitionPruneInfo. I gather
>> that each PartitionPruneInfo corresponds to each root partitioned table
>> and a PartitionedRelPruneInfo contains the actual pruning information,
>> which is created for every partitioned table (non-leaf tables), including
>> the root tables. I don't think such nesting is necessary. I think that
>> just like flattened partitioned_rels list, we should put flattened list of
>> PartitionedRelPruneInfo into the Append or MergeAppend plan. No need for
>> nesting PartitionedRelPruneInfo under PartitionPruneInfo.
>
> To do that properly you'd need to mark the target partitioned table of
> each hierarchy. Your test of pg_class.relispartition is not going to
> work as you're assuming the query is always going to the root. The
> user might query some intermediate partitioned table (which will have
> relispartition = true). Your patch will fall flat in that case.

Yeah, I forgot to consider that.

> You could work around that by having some array that points to the
> target partitioned table of each hierarchy, but I don't see why that's
> better than having the additional struct.

Or it could be a Bitmapset called root_indexes that stores the offset of
the first Index value in every partitioned_rels list contained in turn in
the list that's passed to make_partition_pruneinfo.

> There's also some code
> inside ExecFindInitialMatchingSubPlans() which does a backward scan
> over the partitions. This must process children before their parents.
> Unsure how well that's going to work if we start mixing the
> hierarchies. I'm sure it can be made to work providing each hierarchy
> is stored together consecutively in the array, but it just seems
> pretty fragile to me. That code is already pretty hard to follow.

I don't see how removing a nested loop changes things for worse. AIUI,
the code replaces index values contained in the subplan_map arrays of
various PartitionedRelPruningData structs to account for any pruned
sub-plans. Removing a nesting level because of having removed the nesting
struct doesn't seem to affect anything about that translation. But your
point here seems to be about the relative ordering of
PartitionedRelPruningData structs among themselves being affected due to
their now being put into a flat array, although I don't see that as being
any more fragile. We already are assuming a bunch about the relative
ordering of sub-plans or of PartitionedRelPruningData structs to have been
relying on storing their indexes in subplan_map and subpart_map. Also, it
occurred to me that the new subplan indexes that
ExecFindInitialMatchingSubPlans computes are based on where subplans are
actually stored in the AppendState.appendplans array, which, in turn, is
based on the Bitmapset of "valid subplans" that
ExecFindInitialMatchingSubPlans passes back to ExecInitAppend.

> What's the reason you don't like the additional level to represent
> multiple hierarchies?

I started thinking about flattening PartitionedRelPruneInfo after looking
at flatten_partitioned_rels() in your patch. If we're flattening
partitioned_rels (that is, not having it as a List of Lists in the
finished plan), why not flatten the pruning info node too? As I said
earlier, I get it that we need List of Lists within the planner to get
make_partition_pruneinfo to work correctly in these types of queries, but
once we have figured out the correct details to pass to executor to
perform run-time pruning, I don't see why we need to pass that info again
as a List of Lists.

I have attached v2 of the delta patch which adds a root_indexes field to
PartitionPruneInfo to track topmost parents in every partition hierarchy
contained whose pruning info is contained in the Append.

Also, I noticed a bug with how ExecFindInitialMatchingSubPlans handles
other_subplans. While the indexes in subplan_map arrays are updated to
contain revised values after pruning, those in the other_subplans
Bitmapset are not, leading to crashes or possibly wrong result. For example:

create table p (a int, b int, c int) partition by list (a);
create table p1 partition of p for values in (1);
create table p2 partition of p for values in (2);
create table q (a int, b int, c int) partition by list (a);
create table q1 partition of q for values in (1) partition by list (b);
create table q11 partition of q1 for values in (1) partition by list (c);
create table q111 partition of q11 for values in (1);
create table q2 partition of q for values in (2) partition by list (b);
create table q21 partition of q2 for values in (1);
create table q22 partition of q2 for values in (2);

prepare q (int, int) as
select *
from (
select * from p
union all
select * from q1
union all
select 1, 1, 1
) s(a, b, c)
where s.a = $1 and s.b = $2 and s.c = (select 1);

set plan_cache_mode TO force_generic_plan;

explain (costs off, analyze) execute q (1, 1);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.

I have attached a fix for that as a delta patch, which results in:

explain (costs off, analyze) execute q (1, 1);
QUERY PLAN
──────────────────────────────────────────────────────────────────
Append (actual time=0.153..0.179 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Result (actual time=0.023..0.032 rows=1 loops=1)
Subplans Removed: 1
-> Seq Scan on p1 (actual time=0.022..0.022 rows=0 loops=1)
Filter: ((a = $1) AND (b = $2) AND (c = $0))
-> Seq Scan on q111 (actual time=0.012..0.012 rows=0 loops=1)
Filter: ((a = $1) AND (b = $2) AND (c = $0))
-> Result (actual time=0.014..0.022 rows=1 loops=1)
One-Time Filter: ((1 = $1) AND (1 = $2) AND (1 = $0))
Planning Time: 8.136 ms
Execution Time: 0.562 ms
(12 rows)

Thanks,
Amit

Attachment Content-Type Size
v3-0001-delta-v2.patch text/plain 30.7 KB
v3-0001-other_subplans-bug-delta.patch text/plain 820 bytes

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2018-07-25 09:09:05 Re: ToDo: show size of partitioned table
Previous Message Benjamin Scherrey 2018-07-25 07:48:01 Re: How can we submit code patches that implement our (pending) patents?