Re: Parallel Append can break run-time partition pruning

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Robert Haas <robert(dot)haas(at)enterprisedb(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Parallel Append can break run-time partition pruning
Date: 2020-04-25 11:24:48
Message-ID: CAApHDvpd2kueBzvtanGbgoWLXBTR-PBZBRCXdj2_npeS0guwTg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 23 Apr 2020 at 22:37, David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
> For now. I'd still like to get a bit more confidence that the only
> noticeable change in the outcome here is that we're now pulling up
> sub-Appends in all cases. I've read the code a number of times and
> just can't quite see any room for anything changing. My tests per
> Robert's case all matched what the previous version did, but I'm still
> only about 93% on this. Given that I'm aiming to fix a bug in master,
> v11 and v12 here, I need to get that confidence level up to about the
> 100% mark.

I looked at this again and I don't think what I've got is right.

The situation that I'm concerned about is that we generate a Parallel
Append path for a sub-partitioned table and that path has a mix of
partial and parallel safe paths. When we perform
add_paths_to_append_rel() for the top-level partition, if for some
reason we didn't do a Parallel Append, then we could pullup the mixed
set of partial and parallel safe paths from the child-Append. We can't
execute a parallel_safe subpath under a normal Append that's below a
Gather as multiple workers could execute the same parallel safe scan,
which won't lead to anything good, probably wrong results at best.
Now, we'll only ever have a Parallel Append for the sub-partitioned
table if enable_parallel_append is on and the rel's consider_parallel
is switched on too, but if for some reason the top-level partitioned
table had consider_parallel set to off, then we could end up pulling
up the subpaths from a Parallel Append path, which could contain a mix
of partial and parallel safe paths and we'd then throw those into a
non-Parallel Append partial path! The parallel safe path just can't
go in one of those as multiple workers could then execute that path.
Only Parallel Append knows how to execute those safely as it ensures
only 1 worker touches it.

Now, looking at the code today, it does seem that a child rel will do
parallel only if the parent rel does, but I don't think it's a great
idea to assume that will never change. Additionally, it seems fragile
to also assume the value of the enable_parallel_append, a global
variable would stay the same between calls too.

I wonder if we could fix this by when we call
add_paths_to_append_rel() for the top-level partitioned table, just
recursively get all the child rels and their children too.
accumulate_append_subpath() would then never see Appends or
MergeAppends as we'd just be dealing with scan type Paths.
add_paths_to_append_rel() for sub-partitioned tables wouldn't have to
do very much then. I'll need to see how that idea would fit in with
partition-wise joins. My guess is, not very well. There's also
apply_scanjoin_target_to_paths() which calls add_paths_to_append_rel()
too.

The only other idea I have so far is just to either not generate the
partial path using the partial_subpaths list in
add_paths_to_append_rel() when pa_subpaths_valid and
pa_nonpartial_subpaths != NIL. i.e only create the first of those
partial paths if we're not going to create the 2nd one. Or even just
swap the order that we add_partial_path() them so that
add_partial_path() does not reject the path with the flattened
Appends.

David

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-04-25 12:40:24 Re: Binary search in ScalarArrayOpExpr for OR'd constant arrays
Previous Message Juan José Santamaría Flecha 2020-04-25 09:05:35 Re: Anybody want to check for Windows timezone updates?