Re: [HACKERS] [PATCH] Incremental sort

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Alexander Kuzmenkov <a(dot)kuzmenkov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Teodor Sigaev <teodor(at)sigaev(dot)ru>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Incremental sort
Date: 2018-04-03 09:09:24
Message-ID: CAPpHfdudiZ5OtJs5cNse7q5ZKs15iMM1mZajVdWVhJJvTy7iug@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 1, 2018 at 12:06 AM, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On 03/31/2018 10:43 PM, Tomas Vondra wrote:
> > ...
> > But I'm pretty sure it may lead to surprising behavior - for example if
> > you disable incremental sorts (enable_incrementalsort=off), the plan
> > will switch to plain sort without the additional costs. So you'll get a
> > cheaper plan by disabling some operation. That's surprising.
> >
>
> To illustrate this is a valid issue, consider this trivial example:
>
> create table t (a int, b int, c int);
>
> insert into t select 10*random(), 10*random(), 10*random()
> from generate_series(1,1000000) s(i);
>
> analyze t;
>
> explain select * from (select * from t order by a,b) foo order by a,b,c;
>
> QUERY PLAN
> ------------------------------------------------------------------------
> Incremental Sort (cost=133100.48..264139.27 rows=1000000 width=12)
> Sort Key: t.a, t.b, t.c
> Presorted Key: t.a, t.b
> -> Sort (cost=132154.34..134654.34 rows=1000000 width=12)
> Sort Key: t.a, t.b
> -> Seq Scan on t (cost=0.00..15406.00 rows=1000000 width=12)
> (6 rows)
>
> set enable_incrementalsort = off;
>
> explain select * from (select * from t order by a,b) foo order by a,b,c;
> QUERY PLAN
> ------------------------------------------------------------------------
> Sort (cost=261402.69..263902.69 rows=1000000 width=12)
> Sort Key: t.a, t.b, t.c
> -> Sort (cost=132154.34..134654.34 rows=1000000 width=12)
> Sort Key: t.a, t.b
> -> Seq Scan on t (cost=0.00..15406.00 rows=1000000 width=12)
> (5 rows)
>
> So the cost with incremental sort was 264139, and after disabling the
> incremental cost it dropped to 263902. Granted, the difference is
> negligible in this case, but it's still surprising.
>
> Also, it can be made much more significant by reducing the number of
> prefix groups in the data:
>
> truncate t;
>
> insert into t select 1,1,1 from generate_series(1,1000000) s(i);
>
> analyze t;
>
> set enable_incrementalsort = on;
>
> explain select * from (select * from t order by a,b) foo order by a,b,c;
>
> QUERY PLAN
> ------------------------------------------------------------------------
> Incremental Sort (cost=324165.83..341665.85 rows=1000000 width=12)
> Sort Key: t.a, t.b, t.c
> Presorted Key: t.a, t.b
> -> Sort (cost=132154.34..134654.34 rows=1000000 width=12)
> Sort Key: t.a, t.b
> -> Seq Scan on t (cost=0.00..15406.00 rows=1000000 width=12)
> (6 rows)
>
> So that's 263902 vs. 341665, yet we still prefer the incremental mode.

Problem is well-defined, thank you.
I'll check what can be done in this field today.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavan Deolasee 2018-04-03 09:11:10 Re: Missing parse_merge.h?
Previous Message Satoshi Nagayasu 2018-04-03 09:07:02 Missing parse_merge.h?