From: | David Rowley <dgrowleyml(at)gmail(dot)com> |
---|---|
To: | Amit Langote <amitlangote09(at)gmail(dot)com> |
Cc: | Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Khandekar <amitdkhan(dot)pg(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Parallel Append can break run-time partition pruning |
Date: | 2020-10-25 01:06:38 |
Message-ID: | CAApHDvqQR7i6feyEvFaE_gr2UgX0=BNHW63tyHHZrsvu5UEYrA@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 16 Oct 2020 at 03:01, Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>
> Going over the last few emails, it seems that David held off from
> committing the patch, because of the lack of confidence in its
> robustness. With the patch, a sub-partitioned child's partial
> Append's child paths will *always* be pulled up into the parent's set
> of partial child paths thus preventing the nesting of Appends, which
> the run-time pruning code can't currently cope with. The lack of
> confidence seems to be due to the fact that always pulling up a
> sub-Append's child paths into the parent partial Append's child paths
> *may* cause the latter to behave wrongly under parallelism and the new
> code structure will prevent add_partial_path() from being the
> arbitrator of whether such a path is really the best in terms of cost.
>
> If we can't be confident in that approach, maybe we should consider
> making the run-time pruning code cope with nested Appends?
I've been thinking about this and my thoughts are:
There are other cases where we don't pullup sub-Merge Append nodes
anyway, so I think we should just make run-time pruning work for more
than just the top-level Append/Merge Append.
The case I'm thinking about is the code added in 959d00e9dbe for
ordered Append scans. It's possible a sub-partitioned table has
partitions which don't participate in the same ordering. We need to
keep the sub-Merge Append in those cases... well, at least when
there's more than 1 partition remaining after plan-time pruning.
I've attached the patch which, pending a final look, I'm proposing to
commit for master only. I just don't quite think this is a bug fix,
and certainly, due to the invasiveness of the proposed patch, that
means no backpatch.
I fixed all the stuff about the incorrectly set partitioned_rels list.
What I ended up with there is making it accumulate_append_subpath's
job to also pullup the sub-paths partitioned_rels fields when pulling
up a nested Append/MergeAppend. If there's no pullup, there then we
don't care about the sub-path's partitioned_rels. That's for it to
deal with. With that, I think that run-time pruning will only get the
RT indexes for partitions that we actually have sub-paths for. That's
pretty good, so I added an Assert() to verify that in
make_partitionedrel_pruneinfo(). (I hope I don't regret that later)
This does mean we need to maintain a different partitioned_rels list
for each Append path we consider. So there's a number (six) of these
now between add_paths_to_append_rel() and
generate_orderedappend_paths(). To try to minimise the impact of
that, I've changed the field so that instead of being a List of
IntLists, it's just a List of Relids. The top-level list just
contains a single element until you get a UNION ALL that selects from
a partitioned table on each side of the union. Merging sub-path
partitioned rel RT indexes into the existing element is now pretty
cheap as it just uses bms_add_members() rather the list_concat we'd
have had to have used if it was still a List of IntLists.
After fixing up how partitioned_rels is built, I saw there were no
usages of RelOptInfo.partitioned_child_rels, so I got rid of it.
I did another couple of little cleanups and wrote some regression
tests to test all this.
Overall I'm fairly happy with this, especially getting rid of a
partitioned table related field from RelOptInfo.
David
Attachment | Content-Type | Size |
---|---|---|
v1-0001-Allow-run-time-pruning-on-nested-Append-MergeAppe.patch | text/plain | 36.3 KB |
From | Date | Subject | |
---|---|---|---|
Next Message | Hou, Zhijie | 2020-10-25 01:22:55 | Fix typo in src/backend/utils/cache/lsyscache.c |
Previous Message | Justin Pryzby | 2020-10-24 19:59:49 | Re: pg_dump, ATTACH, and independently restorable child partitions |