Re: [PATCH] Incremental sort (was: PoC: Partial sort)

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Shaun Thomas <shaun(dot)thomas(at)2ndquadrant(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL Developers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date: 2020-03-11 02:43:55
Message-ID: 20200311024355.5h2v56jpgzvd7yox@development
Views: Raw Message | Whole Thread | Download mbox | Resend email
Lists: pgsql-hackers


I've been focusing on the planner part of the patch, particularly on
picking places need to consider incremental sort paths. As discussed in
the past, we might simply consider incremental sort everywhere we need
sorted path, but that's likely an overkill. So I used the set of queries
used for previous tests [1] and determined which places affect the most
plans (using a separate GUC for each place, per [2]).

The result is attached, on top of the incremental sort patch. Part 0004
tweak all places to cover all plan changes in the test queries, and part
0005 tweaks a couple additional places where I think we should consider
incremental sort (but I've been unable to construct a query for which
it'd make a difference).

I've also removed the various GUCs and all the places are now using the
same enable_incrementalsort GUC. Compared to [2] I've also modified the
code to consider full and incremental sort in the same loop, which does
allow reusing some of the work.

The next thing I work on is determining how expensive this can be, in
extreme cases with many indexes etc.

Now, a couple comments about parts 0001 - 0003 of the patch ...

1) I see a bunch of failures in the regression test, due to minor
differences in the explain output. All the differences are about minor
changes in memory usage, like this:

- "Sort Space Used": 30, +
+ "Sort Space Used": 29, +

I'm not sure if it happens on my machine only, but maybe the test is not
entirely stable.

2) I think this bit in ExecReScanIncrementalSort is wrong:

node->sort_Done = false;
node->prefixsort_state = NULL;
node->prefixsort_state = NULL;
node->bound_Done = 0;

Notice both places reset fullsort_state and set prefixsort_state to
NULL. Another thing is that I'm not sure it's fine to pass NULL to
tuplesort_end (my guess is tuplesort_free will fail when it gets NULL).

3) Most of the execution plans look reasonable, except that some of the
plans look like this:

-> GroupAggregate
Group Key: t.a, t.b, t.c, t.d
-> Incremental Sort
Sort Key: t.a, t.b, t.c, t.d
Presorted Key: t.a, t.b, t.c
-> Incremental Sort
Sort Key: t.a, t.b, t.c
Presorted Key: t.a, t.b
-> Index Scan using t_a_b_idx on t
(10 rows)

i.e. there are two incremental sorts on top of each other, with
different prefixes. But this this is not a new issue - it happens with
queries like this:

SELECT a, b, c, d, count(*) FROM (
) foo GROUP BY a, b, c, d limit 1000;

i.e. there's a subquery with a subset of pathkeys. Without incremental
sort the plan looks like this:

-> GroupAggregate
Group Key: t.a, t.b, t.c, t.d
-> Sort
Sort Key: t.a, t.b, t.c, t.d
-> Sort
Sort Key: t.a, t.b, t.c
-> Seq Scan on t
(8 rows)

so essentially the same plan shape. What bugs me though is that there
seems to be some sort of memory leak, so that this query consumes
gigabytes os RAM before it gets killed by OOM. But the memory seems not
to be allocated in any memory context (at least MemoryContextStats don't
show anything like that), so I'm not sure what's going on.

Reproducing it is fairly simple:

CREATE TABLE t (a bigint, b bigint, c bigint, d bigint);
1000*random(), 1000*random(), 1000*random(), 1000*random()
FROM generate_series(1,10000000) s(i);
CREATE INDEX idx ON t(a,b);

EXPLAIN ANALYZE SELECT a, b, c, d, count(*)
FROM (SELECT * FROM t ORDER BY a, b, c) foo GROUP BY a, b, c, d
LIMIT 100;




Tomas Vondra
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
0001-Consider-low-startup-cost-when-adding-parti-20200310.patch text/plain 3.2 KB
0002-Implement-incremental-sort-20200310.patch text/plain 137.7 KB
0003-Rework-EXPLAIN-for-incremental-sort-20200310.patch text/plain 31.9 KB
0004-Consider-incremental-sort-paths-in-addition-20200310.patch text/plain 13.1 KB
0005-A-couple-more-places-for-incremental-sort-20200310.patch text/plain 8.8 KB

In response to


Browse pgsql-hackers by date

  From Date Subject
Next Message Masahiko Sawada 2020-03-11 03:00:41 Re: Berserk Autovacuum (let's save next Mandrill)
Previous Message Amit Kapila 2020-03-11 02:27:09 Re: [HACKERS] Moving relation extension locks out of heavyweight lock manager