Re: Parallel Append can break run-time partition pruning

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Robert Haas <robert(dot)haas(at)enterprisedb(dot)com>, amitdkhan(dot)pg(at)gmail(dot)com, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel Append can break run-time partition pruning
Date: 2020-04-17 07:07:54
Message-ID: CA+HiwqG7mSYLhCkGLMyAfx_mWAAyDzV_DyN5K2CdTgzXXgofAw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Apr 15, 2020 at 4:18 PM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> I'm a bit divided on what the correct fix is. If I blame Parallel
> Append for not trying hard enough to pull up the lower Append in
> accumulate_append_subpath(), then clearly the parallel append code is
> to blame.

I spent some time trying to understand how Append parallelism works
and I am tempted to agree with you that there might be problems with
how accumulate_append_subpath()'s interacts with parallelism. Maybe it
would be better to disregard a non-parallel-aware partial Append if it
requires us to fail on flattening a child Append. I have as attached
a PoC fix to show that. While a nested Append is not really a problem
in general, it appears to me that our run-time code is not in position
to work correctly with them, or at least not with how things stand
today...

> However, perhaps run-time pruning should be tagging on
> PartitionPruneInfo to more than top-level Appends. Fixing the latter
> case, code-wise is about as simple as removing the "rel->reloptkind ==
> RELOPT_BASEREL &&" line from create_append_plan(). Certainly, if the
> outer Append hadn't been a single subpath Append, then we wouldn't
> have pulled up the lower-level Append, so perhaps we should be
> run-time pruning lower-level ones too.

While looking at this, I observed that the PartitionPruneInfo of the
top-level Append (the one that later gets thrown out) contains bogus
information:

{PARTITIONPRUNEINFO
:prune_infos ((
{PARTITIONEDRELPRUNEINFO
:rtindex 1
:present_parts (b 0)
:nparts 1
:subplan_map 0
:subpart_map 1

One of these should be -1.

{PARTITIONEDRELPRUNEINFO
:rtindex 2
:present_parts (b)
:nparts 2
:subplan_map -1 -1
:subpart_map -1 -1

subplan_map values are not correct, because subpaths list that would
have been passed would not include paths of lower-level partitions as
the flattening didn't occur.

))
:other_subplans (b)
}

I guess the problem is that we let an Append be nested, but don't
account for that in how partitioned_rels list it parent Append is
constructed. The top-level Append's partitioned_rels should not have
contained sub-partitioned table's RTI if it got its own Append. Maybe
if we want to make run-time pruning work with nested Appends, we need
to fix how partitioned_rels are gathered.

--
Amit Langote
EnterpriseDB: http://www.enterprisedb.com

Attachment Content-Type Size
avoid-partial-nested-append.patch application/octet-stream 2.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kashif Zeeshan 2020-04-17 07:08:14 Re: WIP/PoC for parallel backup
Previous Message Fujii Masao 2020-04-17 07:03:34 Re: Race condition in SyncRepGetSyncStandbysPriority