Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
Date: 2018-02-19 05:01:12
Message-ID: CAKJS1f-+i1Cv=NXGBVGnb2FVBCKXd2ny_HeuLK8s1b-WNOOKxw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 19 February 2018 at 15:11, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
> 1) I can confirm that it indeed eliminates the Append overhead, using
> the example from [1], modified to use table with a single partition. Of
> course, this is a pretty formal check, because the patch simply removes
> the Append node and so it'd be utterly broken if the overhead did not
> disappear too.
>
> [1]
> https://www.postgresql.org/message-id/CAKJS1f9UXdk6ZYyqbJnjFO9a9hyHKGW7B%3DZRh-rxy9qxfPA5Gw%40mail.gmail.com

Thanks for testing that.

> 2) If I had to guess where the bugs will be, my guess would be a missing
> finalize_plan_tlist call - not necessarily in existing code, but perhaps
> in future improvements.

I agree with that. I'd say though that additionally for the future
that we might end up with some missing translation calls to
replace_translatable_exprs().

To try to ensure I didn't miss anything, I did previously modify the
regression test tables "tenk" and "tenk1" to become partitioned tables
with a single partition which allowed all possible values to try to
ensure I'd not missed anywhere. I just failed to find a reasonable way
to make this a more permanent test without enforcing that change on
the tables as a permanent change.

> I wonder if we could automate this call somehow, say by detecting when
> create_plan_recurse is called after build_path_tlist (for a given path),
> and if needed calling finalize_plan_tlist from create_plan_recurse.

Sounds nice, but I'm unsure how we could do that.

> Although, now that I look at it, promote_child_relation may be doing
> something like that by adding stuff to the translate_* variables.

Do you mean this?

/*
* Record this child as having been promoted. Some places treat child
* relations in a special way, and this will give them a VIP ticket to
* adulthood, where required.
*/
root->translated_childrelids =
bms_add_members(root->translated_childrelids, child->relids);

That's to get around a problem in use_physical_tlist() where there's
different behaviour for RELOPT_OTHER_MEMBER_REL than there is for
RELOPT_BASEREL. Another option might be to instead change the
RELOPT_OTHER_MEMBER_REL into RELOPT_BASEREL, although I was a little
too scared that might result in many other areas requiring being
changed. I may be wrong about that though. We do, for example,
occasionally change a RELOPT_BASEREL into a RELOPT_DEADREL, so
RelOptInfos changing their RelOptKind is not a new concept, so perhaps
it's fine.

> I'm
> not quite sure why we need to append to the lists, though?

Multiple Append rels may be replaced by their only-child relation, so
we'd need to be able to translate the targetlists for both. For
example, joins above the Appends may contain Vars which belong to
either of the Appends and those need to be translated into the
promoted child relation's Vars.

> 3) I'm not quite sure what replace_translatable_exprs does, or more
> precisely why do we actually need it. At this point the comment pretty
> much says "We need to do translation. This function does translation."
> Perhaps it should mention why removing/skipping the AppendPath implies
> the need for translation or something?

It does mention "Expr translation is required to translate the
targetlists of nodes above the Append", but perhaps I can make that a
bit more clear.

Let's say we have a query such as:

SELECT * FROM fact f INNER JOIN dim1 d ON f.dim1_id = d.id WHERE
f.date BETWEEN '2018-01-01' AND '2018-01-31';

Which results in a hash join and a single Jan 2018 partition being
scanned for "fact". A plan without the Append node having been removed
would have the Append targetlist containing the Vars from the "fact"
table, as would the Hash Join node... The problem this function is
trying to solve is translating the Vars in the Hash Join node (or any
node above the Append) to change the "fact" Vars into "fact_jan2018"
Vars. In create_plan.c we skip over any isproxy Append paths and
instead return the recursive call to the single proxy-path. Without
the translation we'd run into problems when trying to find Vars in the
targetlists later.

> 4) The thread talks about "[Merge]Append" but the patch apparently only
> tweaks create_append_path and does absolutely nothing to "merge" case.
> While looking into why, I notices that generate_mergeappend_paths always
> gets all_child_pathkeys=NULL in this case (with single subpath), and so
> we never create the MergePath at all. I suppose there's some condition
> somewhere responsible for doing that, but I haven't found it ...

Yeah, only Append paths are used as proxy paths. The confusion here is
that the path creation logic has also been changed in allpaths.c (see
generate_proxy_paths()), so that we no longer build MergeAppend paths
when there's a single subpath. It's probably just the fact that Append
is being hi-jacked to act as ProxyPath that's causing the confusion
there. If you look over generate_proxy_paths and where it gets called
from, you'll see that we, instead of creating Append/MergeAppend
paths, we just add each path from pathlist and partial_pathlist from
the only-child subpath to the Append rel. This what gives the planner
the same flexibility to generate a plan as if the only child partition
had been written the query, instead of the parent.

> 5) The comment before AppendPath typedef says this:
>
> * An AppendPath with a single subpath can be set up to become a "proxy"
> * path. This allows a Path which belongs to one relation to be added to
> * the pathlist of some other relation. This is intended as generic
> * infrastructure, but its primary use case is to allow Appends with
> * only a single subpath to be removed from the final plan.
>
> I'm not quite sure how adding a 'isproxy' flag into AppendPath node
> makes this a generic infrastructure? Wouldn't an explicit ProxyPath be a
> better way to achieve that?

It's supposed to be generic in the sense that I didn't aim to code it
just to allow Appends with a single subpath to build a plan with the
single subpath instead of the Append. I'll give an example a bit
below.

It evolved that way due to an email early on in this thread. I
proposed 2 ways to do this, which included ProxyPath as 1 option. Tom
proposed hi-jacking Append to save on the additional boilerplate code.
At the time to implement that, I imagined that I could have gotten
away with just adding the two translation Lists to the AppendPath
struct, and have code determine it's a proxy rather than an Append by
the fact that the Lists are non-empty, but that fails due to lack of
anything to translate for SELECT COUNT(*) FROM ... type queries which
have empty target lists.

I don't have a huge preference as to which gets used here, but I don't
want to change it just to have to change it back again later.
Although, I'm not ruling out the fact that Tom might change his mind
when he sees that I had to add an "isproxy" flag to AppendPath to get
the whole thing to work. It's certainly not quite as convenient as how
dummy paths work.

> What other proxy paths do you envision, and why should they be
> represented as AppendPath?

For example:

CREATE MATERIALIZED VIEW mytable_count AS SELECT COUNT(*) FROM mytable;

which would allow queries such as:

SELECT COUNT(*) FROM mytable;

to have an isproxy AppendPath added to the pathlist for the RelOptInfo
for mytable which just performs a seqscan on mytable_count instead. It
would likely come out a much cheaper Path. The translation exprs, in
this case would translate the COUNT(*) Aggref into the Var belonging
to mytable_count.

Of course, there are other unrelated reasons why that particular
example can't work yet, but that was just an example of why I tried to
keep it generic.

> FWIW one advantage of a separate ProxyPath would be that it would be a
> much clearer breakage for third-party code (say, extensions tweaking
> paths using hooks), either at compile or execution time. With just a new
> flag in existing node, that may easily slip through. On the other hand
> no one promises the structures to be stable API ...

hmm true, but you could probably have said that for overloading an Agg
node to become Partial Aggregate and Finalize Aggregate nodes too.

> 6) I suggest we add this assert to create_append_path:
>
> Assert(!isproxy || (isproxy && (list_length(subpaths)==1)));

That kind of indirectly exists already as two seperate Asserts(),
although, what I have is equivalent to:

Assert((!isproxy && translate_from == NIL) || (isproxy &&
(list_length(subpaths)==1)));

... as two separate Asserts(), which, if I'm not mistaken, is slightly
more strict than the one you mentioned.

> and perhaps we should do s/isproxy/is_proxy/ which seems like the usual
> naming for boolean variables.

You're right. I'll change that and post an updated patch. I'll also
reprocess your email again and try to improve some comments to make
the intent of the code more clear.

Thanks for the review!

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2018-02-19 05:50:48 Re: Server Crash while executing pg_replication_slot_advance (second time)
Previous Message Amit Langote 2018-02-19 04:59:28 Re: [HACKERS] path toward faster partition pruning