Re: Removing [Merge]Append nodes which contain a single subpath

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Removing [Merge]Append nodes which contain a single subpath
Date: 2017-10-31 13:47:32
Message-ID: CAKJS1f9eHBEykHLsvN-Tn5SwH+GBJzWmrSq=qS1S7rw9c+mvkw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 27 October 2017 at 04:47, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> David Rowley <david(dot)rowley(at)2ndquadrant(dot)com> writes:
>> It seems like a good idea for the planner not to generate Append or
>> MergeAppend paths when the node contains just a single subpath.
>> ...
>> Perhaps there is also a "Method 3" which I've not thought about which
>> would be much cleaner.
>
> Have you considered turning AppendPath with a single child into more
> of a special case? I think this is morally equivalent to your ProxyPath
> proposal, but it would take less new boilerplate code to support.
> If you taught create_append_path to inherit the child's pathkeys
> when there's only one child, and taught createplan.c to skip making the
> useless Append node, I think you might be nearly done.

That seems reasonable, although after going and implementing this, it
does not seem quite as convenient as dummy paths are. I had to make
the AppendPath carries a bit more weight than I originally thought.
It now stores translate to/from expressions and also a single bool
(isproxy) to mark if it's a proxy type Append. I tried to do without
this bool and just check if there's a single subpath and some
translation expressions, but that was getting fooled by Paths which
had empty targetlists which meant there were no translations, and
going on the single subpath alone is no good as do we create append
paths more places than just add_paths_to_append_rel().

I've attached a patch which goes and implements all this. I think it
still needs a bit of work. I'm not 100% confident I'm translating
expressions in all the required places in createplan.c. I did modify
tenk1 to become the parent of a single partition and ran the
regression tests. There appear to be 4 subtle differences to querying
a parent with an only-child vs querying the child directly.

1. EXPLAIN VERBOSE shows the columns prefixed by the table name. This
is caused by "useprefix = list_length(es->rtable) > 1;" in explain.c.
2. There's a small change in an error message in portals.out. "ERROR:
cursor "c" is not a simply updatable scan of table "tenk1a"" now
mentions the child table instead of the parent. Nothing to worry about
as it appears to already do this with native partitioning.
3. Some plans are parallelized that were not before (see join.out).
This is caused by compute_parallel_worker() having a special case for
non-baserels.
4. Some gating plans don't work the same as they used to, this is
highlighted in aggregates.out and is caused by child rel Const TRUE
baserestrictinfo clauses being dropped in set_append_rel_size.

For differences 2-4, I've attached another file which modifies tenk1
to become a partitioned table with a single partition. I've modified
all the expected results for #1 to reduce the noise, so 3 tests are
still failing for reasons 2-4. I'm not at all concerned with #2. #3
might not be worth fixing, I've just no idea how yet and #4 seems like
it's on purpose.

Other things I'm not 100% happy with are; in prepunion.c I had to
invent a new function named find_all_appinfos_for_child, which is a
bit similar to adjust_appendrel_attrs_multilevel, only it modifies the
attributes, but the use I have for find_all_appinfos_for_child
includes also calling add_child_rel_equivalences for the children at
each level. Probably a bit more thinking will come up with some way to
modify the existing function sets to be more reusable, I've just not
done that yet.

I decided that in createplan.c to have a generic expression translator
rather than something that only understands AppendRelInfos. The reason
for this is two-fold:

1. There may be multiple AppendRelInfos required for a single
translation between a parent and child if there are multiple other
children in between. Seems wasteful to have intermediate translations.
2. I have other uses in mind for this feature that are not Append
child relations.

A sanity check on what I have would be most welcome if anyone has time.

I'm going to add this to the November commitfest.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

Attachment Content-Type Size
remove_singleton_appends_2017-10-31.patch application/octet-stream 47.5 KB
remove_singleton_appends_examples_of_differences_2017-10-31.patch application/octet-stream 82.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-10-31 13:59:33 Re: Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM
Previous Message Jan Przemysław Wójcik 2017-10-31 13:02:27 Re: pg_trgm word_similarity inconsistencies or bug