Re: enable_incremental_sort changes query behavior

From: James Coleman <jtc331(at)gmail(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Jaime Casanova <jaime(dot)casanova(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: enable_incremental_sort changes query behavior
Date: 2020-11-25 20:37:29
Message-ID: CAAaqYe_RaBHV05cExUjCYzidyLZzT+3fVVfVNLUkw0bQpVHnug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Nov 25, 2020 at 2:53 PM James Coleman <jtc331(at)gmail(dot)com> wrote:
>
> On Tue, Nov 24, 2020 at 2:31 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> >
> > James Coleman <jtc331(at)gmail(dot)com> writes:
> > > On Mon, Nov 23, 2020 at 2:24 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > >> 1. James was wondering, far upthread, why we would do projections
> > >> pre-sort or post-sort. I think the answers can be found by studying
> > >> planner.c's make_sort_input_target(), which separates out what we want
> > >> to do pre-sort and post-sort.
> >
> > > Does it imply we need to intentionally avoid SRFs also?
> >
> > It's sort of a wart that we allow SRFs in ORDER BY at all, but my
> > expectation is that make_sort_input_target would prevent lower levels
> > of the planner from needing to think about that. We don't allow SRFs
> > in index expressions, nor in WHERE clauses (so they'll never come up
> > as mergejoin sort keys). So really there's no way that scan/join
> > processing should need to consider such cases. Such a sort would
> > only ever be implemented via a final Sort atop ProjectSet.
>
> In this case though we're sorting based on "interesting" pathkeys,
> which means we don't don't necessarily have index expressions or
> mergejoin sort keys. For example, even with the bugfix patch (from the
> parallel fix thread I've linked to previously) applied, I'm able to
> generate this (with of course some GUCs "tweaked"):
>
> select unique1 from tenk1 order by unnest('{1}'::int[]);
>
> Gather Merge
> Workers Planned: 2
> -> Sort
> Sort Key: (unnest('{1}'::integer[]))
> -> ProjectSet
> -> Parallel Index Only Scan using tenk1_unique1 on tenk1
>
> If I understand correctly, that's a violation of the spirit of what
> the comments above make_sort_input_target(). If that's true, then it
> should be pretty easy to disallow them from being considered. Given
> the existing restrictions on where SRFs can be placed in a SELECT
> (e.g., no CASE/COALESCE) and my assumption (without having thoroughly
> verified this) that SRFs aren't allowed as arguments to functions or
> as arguments to any other expression (I assume only scalars are
> allowed), would it be sufficient to check the pathkey expression
> (without recursion) to see if it's a FuncExpr that returns a set?

Here's the plan with a change to restrict SRFs here:

Sort
Sort Key: (unnest('{1,2}'::integer[]))
-> Gather
Workers Planned: 2
-> ProjectSet
-> Parallel Index Only Scan using tenk1_unique1 on tenk1

I'm a bit surprised the ProjectSet is above the Index Scan rather than
above the Gather, but maybe this is still more correct?

James

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Victor Yegorov 2020-11-25 21:20:44 Re: Deleting older versions in unique indexes to avoid page splits
Previous Message David Zhang 2020-11-25 20:13:55 Re: Add table access method as an option to pgbench