Re: enable_incremental_sort changes query behavior

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enable_incremental_sort changes query behavior
Date: 2020-10-01 22:08:22
Message-ID: 20201001220822.rwwxajwzobljgqwz@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 01, 2020 at 09:02:57AM -0400, James Coleman wrote:
>On Thu, Oct 1, 2020 at 3:09 AM Jaime Casanova
><jaime(dot)casanova(at)2ndquadrant(dot)com> wrote:
>>
>> On Wed, 30 Sep 2020 at 21:21, James Coleman <jtc331(at)gmail(dot)com> wrote:
>> >
>> > On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova
>> > <jaime(dot)casanova(at)2ndquadrant(dot)com> wrote:
>> > >
>> > > Hi,
>> > >
>> > > With sqlsmith I found a query that gives this error:
>> > > ERROR: ORDER/GROUP BY expression not found in targetlist
>> > >
>> [...]
>> > >
>> > > But if I set enable_incremental_sort to off the query gets executed
>> > > without problems (attached the explain produced for that case)
>> >
>> > Thanks for the report.
>> >
>>
>> Hi,
>>
>> by experiment I reduced the query to this
>>
>> --- 0 ---
>> select distinct
>> subq_0.c1 as c0,
>> case when (true = pg_catalog.pg_rotate_logfile_old()) then
>> ref_0.t else ref_0.t
>> end
>> as c4
>> from
>> public.ref_0,
>> lateral (select
>>
>> ref_0.i as c1
>> from
>> generate_series(1, 100) as ref_1) as subq_0
>> --- 0 ---
>>
>> the only custom table already needed can be created with this commands:
>>
>> --- 0 ---
>> create table ref_0 as select repeat('abcde', (random() * 10)::integer)
>> t, random() * 1000 i from generate_series(1, 500000);
>> create index on ref_0 (i);
>> analyze ref_0 ;
>> --- 0 ---
>>
>>
>> > Is there by an chance an index on ref_0.radi_text_temp?
>> >
>>
>> there is an index involved but not on that field, commands above
>> create the index in the right column... after that, ANALYZE the table
>>
>> > And if you set enable_hashagg = off what plan do you get (or error)?
>> >
>>
>> same error
>
>I was able to reproduce the error without incremental sort enabled
>(i.e., it happens with a full sort also). The function call in the
>SELECT doesn't have to be in a case expression; for example I was able
>to reproduce changing that to `random()::text || ref_0.t`.
>
>It looks like the issue happens when:
>1. The sort happens within a parallel node.
>2. One of the sort keys is an expression containing a volatile
>function call and a column from the lateral join.
>
>Here are the settings I used with your above repro case to show it
>with regular sort:
>
> enable_hashagg=off
> enable_incremental_sort=off
> enable_seqscan=off
> parallel_setup_cost=10
> parallel_tuple_cost=0
>
>The plan (obtained by replacing the volatile function with a stable one):
>
> Unique
> -> Nested Loop
> -> Gather Merge
> Workers Planned: 2
> -> Sort
> Sort Key: ref_0.i, (md5(ref_0.t))
> -> Parallel Index Scan using ref_0_i_idx on ref_0
> -> Function Scan on generate_series ref_1
>
>Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error.
>
>I haven't been able to dig further than that yet, but my intuition is
>to poke around in the parallel query machinery?
>

Nope. Bisect says this was introduced by this commit:

ba3e76cc57 Consider Incremental Sort paths at additional places

Looking at the diff, I suspect generate_useful_gather_paths (or maybe
get_useful_pathkeys_for_relation) fails to do something with the
pathkeys.

Of course, that'd explain why it only happens for parallel plans, as
this builds gather nodes.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Alvaro Herrera 2020-10-01 22:21:27 Re: WIP: BRIN multi-range indexes
Previous Message Andres Freund 2020-10-01 20:59:54 Re: Improving connection scalability: GetSnapshotData()