Re: enable_incremental_sort changes query behavior

From: James Coleman <jtc331(at)gmail(dot)com>
To: Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enable_incremental_sort changes query behavior
Date: 2020-10-01 13:02:57
Message-ID: CAAaqYe9QBf_UWT+-_SrMs4iHKbUC_WkaHvJjthJuqrmvW2UdZw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Oct 1, 2020 at 3:09 AM Jaime Casanova
<jaime(dot)casanova(at)2ndquadrant(dot)com> wrote:
>
> On Wed, 30 Sep 2020 at 21:21, James Coleman <jtc331(at)gmail(dot)com> wrote:
> >
> > On Sat, Sep 26, 2020 at 2:49 PM Jaime Casanova
> > <jaime(dot)casanova(at)2ndquadrant(dot)com> wrote:
> > >
> > > Hi,
> > >
> > > With sqlsmith I found a query that gives this error:
> > > ERROR: ORDER/GROUP BY expression not found in targetlist
> > >
> [...]
> > >
> > > But if I set enable_incremental_sort to off the query gets executed
> > > without problems (attached the explain produced for that case)
> >
> > Thanks for the report.
> >
>
> Hi,
>
> by experiment I reduced the query to this
>
> --- 0 ---
> select distinct
> subq_0.c1 as c0,
> case when (true = pg_catalog.pg_rotate_logfile_old()) then
> ref_0.t else ref_0.t
> end
> as c4
> from
> public.ref_0,
> lateral (select
>
> ref_0.i as c1
> from
> generate_series(1, 100) as ref_1) as subq_0
> --- 0 ---
>
> the only custom table already needed can be created with this commands:
>
> --- 0 ---
> create table ref_0 as select repeat('abcde', (random() * 10)::integer)
> t, random() * 1000 i from generate_series(1, 500000);
> create index on ref_0 (i);
> analyze ref_0 ;
> --- 0 ---
>
>
> > Is there by an chance an index on ref_0.radi_text_temp?
> >
>
> there is an index involved but not on that field, commands above
> create the index in the right column... after that, ANALYZE the table
>
> > And if you set enable_hashagg = off what plan do you get (or error)?
> >
>
> same error

I was able to reproduce the error without incremental sort enabled
(i.e., it happens with a full sort also). The function call in the
SELECT doesn't have to be in a case expression; for example I was able
to reproduce changing that to `random()::text || ref_0.t`.

It looks like the issue happens when:
1. The sort happens within a parallel node.
2. One of the sort keys is an expression containing a volatile
function call and a column from the lateral join.

Here are the settings I used with your above repro case to show it
with regular sort:

enable_hashagg=off
enable_incremental_sort=off
enable_seqscan=off
parallel_setup_cost=10
parallel_tuple_cost=0

The plan (obtained by replacing the volatile function with a stable one):

Unique
-> Nested Loop
-> Gather Merge
Workers Planned: 2
-> Sort
Sort Key: ref_0.i, (md5(ref_0.t))
-> Parallel Index Scan using ref_0_i_idx on ref_0
-> Function Scan on generate_series ref_1

Changing `md5(ref_0.t)` to `random()::text || ref_0.t` causes the error.

I haven't been able to dig further than that yet, but my intuition is
to poke around in the parallel query machinery?

James

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-10-01 13:42:52 Re: Why does PostgresNode.pm set such a low value of max_wal_senders?
Previous Message k.jamison@fujitsu.com 2020-10-01 12:55:34 RE: [Patch] Optimize dropping of relation buffers using dlist