Re: enable_incremental_sort changes query behavior

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enable_incremental_sort changes query behavior
Date: 2020-10-07 22:22:43
Message-ID: 20201007222243.zr6guay2smhopkcq@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Oct 07, 2020 at 04:01:27PM -0400, James Coleman wrote:
>On Wed, Oct 7, 2020 at 3:52 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>>
>> On Thu, Oct 1, 2020 at 9:03 AM James Coleman <jtc331(at)gmail(dot)com> wrote:
>> > The plan (obtained by replacing the volatile function with a stable one):
>> >
>> > Unique
>> > -> Nested Loop
>> > -> Gather Merge
>> > Workers Planned: 2
>> > -> Sort
>> > Sort Key: ref_0.i, (md5(ref_0.t))
>> > -> Parallel Index Scan using ref_0_i_idx on ref_0
>> > -> Function Scan on generate_series ref_1
>> >
>> > Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error.
>>
>> As far as I remember this stuff, target lists for any nodes below the
>> top of the join tree were previously always just Var nodes. Any
>> expressions that needed to be computed were computed at the level of
>> the topmost join, before adding upper planner steps like Agg,
>> WindowAgg, Limit, etc. But, here you've got a complex expression --
>> md5(ref_0.t) -- begin computed somewhere in the middle of the plan
>> tree so that the sort can be performed at that point. However, that
>> seems likely to break a bunch of assumptions all over the planner,
>> because, unless a lot of work and surgery has been done since I looked
>> at this last, that md5(ref_0.t) value is not going to "bubble up" to
>> higher levels of the plan through the tlists of the individual nodes.
>> That poses a risk that it will be recomputed multiple times, which is
>> particularly bad for volatile functions because you might not always
>> get the same answer, but it seems like it might be bad even for
>> non-volatile functions, because it could be slow if the value is used
>> at multiple places in the query. It feels like the sort of thing that
>> might have broken some other assumptions, too, although I can't say
>> exactly what.
>
>The distinction between volatile and stable expressions here has been
>discussed at length in this thread; I think it'd be worth reading
>through the rest of the thread before offering sledgehammer ideas
>below like reverting the entire piece.
>

Um, that's a tad too harsh, I guess ...

We've discussed the volatile/stable expressions before, but AFAIK we
failed to notice the planner assumes such expressions are expected to be
computed only in the upper parts of the plan. If this assumption really
exists, it probably relies on simply calling create_sort_path just about
right, and the new code looking at useful pathkeys below gather merge
might be breaking that simply by doing it too early.

In that case we can either rip this code out, effectively reverting most
of ba3e76cc571, or consider only pathkeys that we can find in the tlist
of the child path.

I'm still not entirely sure I understand what's happening, or what the
exact rule is. Consider this query:

explain (verbose) select distinct i, t, md5(t) from ref_0;

which on PG12 (i.e. before incremental sort) is planned like this:

QUERY PLAN
----------------------------------------------------------------------------------
Unique (cost=78120.92..83120.92 rows=500000 width=65)
Output: i, t, (md5(t))
-> Sort (cost=78120.92..79370.92 rows=500000 width=65)
Output: i, t, (md5(t))
Sort Key: ref_0.i, ref_0.t, (md5(ref_0.t))
-> Seq Scan on public.ref_0 (cost=0.00..10282.00 rows=500000 width=65)
Output: i, t, md5(t)
(7 rows)

i.e. the (stable) function is pushed all the way to the scan node. And
even if we replace it with a volatile expression it gets pushed down:

explain (verbose) select distinct i, t, md5(random()::text || t) from ref_0;

QUERY PLAN
----------------------------------------------------------------------------------
Unique (cost=83120.92..88120.92 rows=500000 width=65)
Output: i, t, (md5(((random())::text || t)))
-> Sort (cost=83120.92..84370.92 rows=500000 width=65)
Output: i, t, (md5(((random())::text || t)))
Sort Key: ref_0.i, ref_0.t, (md5(((random())::text || ref_0.t)))
-> Seq Scan on public.ref_0 (cost=0.00..15282.00 rows=500000 width=65)
Output: i, t, md5(((random())::text || t))
(7 rows)

But perhaps I just don't understand the assumption correctly?

>In particular, most of the discussion in the thread has been about
>what is actually safe to reference at lower levels in the plan. It
>would be extremely easy tomake this only allow sorting by Vars at
>lower levels in the plan, though the current revision of the patch
>follows what prepare_sort_from_pathkeys allows and disallows.
>
>Obviously it'd be nice to be able to sort by non-Var expressions at
>lower levels too. But again, it'd be easy to change that if it's not
>safe, so it'd be helpful if you could point to areas that don't bubble
>non-Vars up so that we can actually comment about that in the source
>and discuss specifics.
>

I think the proble here might be "what we think is theoretically safe"
vs. "what the planner expects". I wouldn't be surprised if this was safe
in theoretical sense, but planner not expecting that.

>> I wonder if whatever part of the incremental sort patch allowed
>> computation of tlist expressions below the top level of the join tree
>> ought to just be reverted outright, but that might be overkill. I
>> can't say that I really understand exactly what's going on here, but
>> it seems like a pretty significant behavior change to make as part of
>> another project, and I wonder if we're going to keep finding other
>> problems with it.
>

It certainly was not my intention to make such change in this patch. I
don't think the earlier parts of the incremental sort have this issue
because all those places were already constructing regular sort before.

But this code is different because it considers adding sort node (both
regular and incremental) below a gather merge, which was not happening
before. I suspect this is the problem, somehow.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2020-10-07 22:34:38 Re: enable_incremental_sort changes query behavior
Previous Message Tom Lane 2020-10-07 22:22:04 Re: Recent failures on buildfarm member hornet