Re: enable_incremental_sort changes query behavior

From: James Coleman <jtc331(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enable_incremental_sort changes query behavior
Date: 2020-11-20 21:30:47
Message-ID: CAAaqYe-KnXwKLYN+PQDuq+eZhP2zjQw0Rb7E-AMpjVcmuRARSQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Thanks much for the detailed followup; this is super helpful.

On Fri, Nov 20, 2020 at 2:57 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
> On Fri, Nov 20, 2020 at 1:51 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
> > > This isn't a counterexample, because there's no join tree here -- or,
> > > well, there is, but it's trivial, because there's only one relation
> > > involved. You can't have a non-Var expression computed before you
> > > finish all the joins, because there are no joins.
> > >
> > > What I said was: "target lists for any nodes below the top of the join
> > > tree were previously always just Var nodes." The topmost join allowed
> > > non-Var nodes before, but not lower levels.
> >
> > As I understand what you're saying, the attached (from the repro case
> > in [1]'s discussion about parallel safety here) is a counterexample.
> >
> > Specifically we have a plan like:
> >
> > Merge Right Join
> > -> Unique
> > -> Gather Merge
> > -> Sort
> > -> Nested Loop
> >
> > The pathtarget of the nested loop contains non-var expressions (in
> > this case a CASE expression).
>
> Well, in this case there are two join trees, one for the subquery and
> one for the outer query. The expressions for the subquery are computed
> at the top of the plan tree for that subquery.
>
> I guess I didn't specify before that I meant "at the top of the join
> tree for a subquery level," but I did mean that. On principle, the
> planner can't very realistically postpone evaluating a subquery's
> expressions until the top of the outer query's join tree, because, as
> in your example here, the subquery might contain an ORDER BY or
> DISTINCT clause. And anyway, if you look at the planner code, you'll
> see that every subquery is basically planned separately, unless it
> gets flattened into the parent query, so even if it were possible to
> postpone expression evaluation to outer subquery levels in some case,
> the current code structure definitely would fail to achieve that goal.

Ah, the "for a subquery level" clarification is helpful.

> However, apart from incremental sort, you can postpone evaluating a
> join tree's expressions until you reach the top of the join tree, and
> I think that's what we have always done in the past. Each baserel in
> the join tree gets a target list containing a list of vars which
> corresponds to its column list, and the joinrels get lists of vars
> which correspond to the columns from the inputs are needed at higher
> levels of the plan tree. At the top of the join tree, we stick in the
> expressions, replacing the target list of the top node in the plan
> tree; the expressions have to be computable from the inputs because
> the inputs contain all the columns that we figured out were needed at
> this level at the beginning of planning. But, if there are scans or
> joins below the top level of the plan tree, then do they bubble up to
> higher levels of the plan? Should they? It's pretty complicated.
> Theoretically if I've computed length(somelongvar) then perhaps I can
> just bubble the computed value up the plan tree, rather than the
> original column, which might be cheaper. But that only works if the
> higher levels only need length(somelongvar) and not somelongvar
> itself.

Apologies for some rambling/overlapping thoughts below; I'm try to
absorb all of this.

A join rel (at the top of a subquery join tree) would have to somehow
know that it needs to pass up both length(somelongvar) and somelongvar
right? Since both might be needed by a higher level?

If I understand correctly you're saying that:
1. Target list entries for join rels contain vars and non-var expressions
2. Target list entries for base rels contain only vars
3. By implication the planner knows how to propagate non-var
expression results up from join rels, but not from base rels.

Point 3 is the one that would surprise me, or at least, I'd be
interested in knowing what makes the two different in that case (is it
an artificial restriction, a result of how things are structured in
code, something else?). I feel like I'm still missing context on why
this is inherently a problem.

Does this mean a simple "SELECT bar * 2 FROM foo" ends up with a join
rel at the top? Or that a projection handles it (maybe this is the
wrong question -- perhaps the project results in a join rel? I don't
know enough here to tell if this ends up being an odd question)? And
if a projection, then would inserting a projection above a base rel
but before the top of the join tree solve the problem while allowing
us to push expression evaluation down?

Perhaps the key (to what I'm missing) is just saying something like:
"the work hasn't been down to push expression evaluation down and
ensure it remains in the target lists at every level of the join tree
so as to be propagated up, and so adding it to the base rel will
almost certainly mean duplicate evaluation". That still leaves my
question about how propagating it up from a subquery join tree works
(maybe that's an already handled "special" case).

> I don't think there's any logic to work stuff like this out,
> unless the incremental sort patch added some, because I don't think we
> ever did it before. And it may be that it doesn't really matter, at
> least when there aren't volatile functions: perhaps the worst case is
> that we evaluate a non-volatile function twice, and maybe that's just
> a performance consequence that we can live with. But on the other
> hand, maybe it breaks stuff.

It seems to me that (given the new restriction on volatile
expressions) it'd likely not be the end of the world from the
unnecessary execution perspective; after all, we already do lots of
duplicate expression evaluation in other areas like filter clauses.

> Or, on the third hand, maybe I'm wrong about what the rule was in the
> first place. This is certainly a complicated topic. I don't think I'm
> wrong, or I wouldn't be bothering to write emails about it, but that
> doesn't mean I'm not wrong anyway.

Well, I for one definitely didn't understand what you were trying to
point out earlier, so I appreciate your continuing the discussion.

It seems to me a (likely?) outcome of this is requiring that the
expressions be in the target list to be considered for sorting here.
And indeed that fixes the non-parallel subplan issue we're also
looking at.

But I think that still leaves something missing: after all,
prepare_sort_from_pathkeys does know how to insert new target list
entries, so something either there (or in the caller/how its called)
has to be enforcing an apparently implicit rule about what point in
the tree it's safe to do such. Or even just no path generation had
ever considered it before (that feels unsatisfactory as an explanation
to me, because it feels more implicit than I'd like, but...)

James

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Rijkers 2020-11-20 21:52:32 Re: Additional Chapter for Tutorial - arch-dev.sgml
Previous Message Peter Geoghegan 2020-11-20 21:21:42 Re: xid wraparound danger due to INDEX_CLEANUP false