Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Michael Paquier <michael(dot)paquier(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
Date: 2019-01-02 19:01:27
Message-ID: 038cf782-271d-942a-c6be-901b27d43f71@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 4/1/18 9:26 AM, David Rowley wrote:
> On 16 March 2018 at 04:01, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> I hadn't been paying much attention to this thread, but I've now taken
>> a quick look at the 2018-02-19 patch, and I've got to say I do not like
>> it much. The changes in createplan.c in particular seem like hack-and-
>> slash rather than anything principled or maintainable.
>
> Thanks for looking at this. I didn't manage to discover any other
> working solutions to when the Vars can be replaced. If we don't do
> this in createplan.c then it's going to cause problems in places such
> as apply_pathtarget_labeling_to_tlist, which is well before setrefs.c
> gets hold of the plan.
>
>> The core issue here is that Paths involving the appendrel and higher
>> will contain Vars referencing the appendrel's varno, whereas the child
>> is set up to emit Vars containing its own varno, and somewhere we've got
>> to match those up. I don't think though that this problem is exactly
>> specific to single-member Appends, and so I would rather we not invent a
>> solution that's specific to that. A nearly identical issue is getting
>> rid of no-op SubqueryScan nodes. I've long wished we could simply not
>> emit those in the first place, but it's really hard to do because of
>> the fact that Vars inside the subquery have different varnos from those
>> outside. (I've toyed with the idea of globally flattening the rangetable
>> before we start planning, not at the end, but haven't made it happen yet;
>> and anyway that would be only one step towards such a goal.)
>
> I'm not quite sure why you think the solution I came up with is
> specific to single-member Appends. The solution merely uses
> single-member Append paths as a proxy path for the solution which I've
> tried to make generic to any node type. For example, the patch also
> resolves the issue for MergeAppend, so certainly nothing in there is
> specific to single-member Appends. I could have made the proxy any
> other path type, it's just that you had suggested Append would be
> better than inventing ProxyPath, which is what I originally proposed.
>

I think a natural question is how the approach devised in this thread
(based on single-member Append paths) could be used to deal with no-op
Subquery nodes. I don't see an obvious way to do that, and I guess
that's what Tom meant by "specific to single-member Appends".

>> It might be worth looking at whether we couldn't fix the single-member-
>> Append issue the same way we fix no-op SubqueryScans, ie let setrefs.c
>> get rid of them. That's not the most beautiful solution perhaps, but
>> it'd be very localized and low-risk.
>
> It might be possible, but wouldn't that only solve 1 out of 2
> problems. The problem of the planner not generating the most optimal
> plan is ignored with this. For example, it does not make much sense to
> bolt a Materialize node on top of an IndexScan node in order to
> provide the IndexScan with mark/restore capabilities... They already
> allow that.
>

Yeah, I think we can't do all the magic in setrefs.c if we want the
decisions to affect plan shape in any significant way - the decision
whether an Append node has only a single node needs to happen while
constructing the path or shortly after it (before we build other paths
on top of it). And before costing the path.

So setrefs.c seems a too late for that :-(

I suppose this limitation was acceptable for no-op Subqueries, but I'm
not sure the same thing applies to single-member Appends.

>> In general setrefs.c is the right place to deal with variable-matching
>> issues. So even if you don't like that specific idea, it'd probably be
>> worth thinking about handling this by recording instructions telling
>> setrefs what to do, instead of actually doing it at earlier stages.
>
> From what I can see, setrefs.c is too late. ERRORs are generated
> before setrefs.c gets hold of the plan if we don't replace Vars.
>
> I'm not opposed to finding a better way to do this.
>

AFAICS the patch essentially does two things: (a) identifies Append
paths with a single member and (b) ensures the Vars are properly mapped
when the Append node is skipped when creating the plan.

I agree doing (a) in setrefs.c is too late, as mentioned earlier. But
perhaps we could do at least (b) to setrefs.c?

I'd say mapping the Vars is the most fragile and error-prone piece of
this patch. It's a bit annoying that it's inventing a new infrastructure
to do that, translates expressions in various places, establishes new
rules for tlists (having to call finalize_plan_tlist when calling
build_path_tlist before create_plan_recurse), etc.

But we already do such mappings (not exactly the same, but similar) for
other purposes - from setrefs.c. So why can't why do it the same way here?

FWIW I haven't tried to do any of that, and I haven't looked on this
patch for quite a long time, so perhaps I'm missing an obvious obstacle
preventing us from doing that ...

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2019-01-02 19:07:51 Re: Implicit make rules break test examples
Previous Message Tom Lane 2019-01-02 18:19:25 Re: Query planner / Analyse statistics bad estimate rows=1 with maximum statistics 10000 on PostgreSQL 10.2