Re: [HACKERS] [PATCH] Incremental sort

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
Cc: Teodor Sigaev <teodor(at)sigaev(dot)ru>, Darafei Komяpa Praliaskouski <me(at)komzpa(dot)net>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Antonin Houska <ah(at)cybertec(dot)at>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: [HACKERS] [PATCH] Incremental sort
Date: 2018-04-07 15:48:33
Message-ID: CAPpHfdvU2YAom5w=QtJnaZBiyA4SYVsc9VCoF_Fj2YVfmhVTZQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Mar 28, 2018 at 6:38 PM, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>
wrote:

> Teodor Sigaev wrote:
> > > BTW, patch had conflicts with master. Please, find rebased version
> attached.
> >
> > Despite by patch conflist patch looks commitable, has anybody objections
> to
> > commit it?
> >
> > Patch recieved several rounds of review during 2 years, and seems to me,
> > keeping it out from sources may cause a lost it. Although it suggests
> > performance improvement in rather wide usecases.
>
> Can we have a recap on what the patch *does*? I see there's a
> description in Alexander's first email
> https://postgr.es/m/CAPpHfdscOX5an71nHd8WSUH6GNOCf=
> V7wgDaTXdDd9=goN-gfA(at)mail(dot)gmail(dot)com
> but that was a long time ago, and the patch has likely changed in the
> meantime ...
>

Ggeneral idea hasn't been changed much since first email.
Incremental sort gives benefit when you need to sort your dataset
by some list of columns while you alredy have input presorted
by some prefix of that list of columns. Then you don't do full sort
of dataset, but rather sort groups where values of prefix columns
are equal (see header comment in nodeIncremenalSort.c).

Same example as in the first letter works, but plan displays
differently.

create table test as (select id, (random()*10000)::int as v1, random() as
v2 from generate_series(1,1000000) id);
create index test_v1_idx on test (v1);

# explain select * from test order by v1, v2 limit 10;
QUERY PLAN
-----------------------------------------------------------------------------------------------
Limit (cost=1.26..1.26 rows=10 width=16)
-> Incremental Sort (cost=1.26..1.42 rows=1000000 width=16)
Sort Key: v1, v2
Presorted Key: v1
-> Index Scan using test_v1_idx on test (cost=0.42..47602.50
rows=1000000 width=16)
(5 rows)

# select * from test order by v1, v2 limit 10;
id | v1 | v2
--------+----+--------------------
216426 | 0 | 0.0697950166650116
96649 | 0 | 0.230586454737931
892243 | 0 | 0.677791305817664
323001 | 0 | 0.708638620562851
87458 | 0 | 0.923310813494027
224291 | 0 | 0.9349579163827
446366 | 0 | 0.984529701061547
376781 | 0 | 0.997424073051661
768246 | 1 | 0.127851997036487
666102 | 1 | 0.27093240711838
(10 rows)

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andres Freund 2018-04-07 15:48:45 Re: [HACKERS] path toward faster partition pruning
Previous Message Andres Freund 2018-04-07 15:26:39 Re: Online enabling of checksums