Re: Using quicksort for every external sort run

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Using quicksort for every external sort run
Date: 2016-04-04 00:46:15
Message-ID: CAM3SWZR5rwzBEZunjNE-OM=xVwmbaBOkVasePXuGFqBcNwg7dw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sun, Apr 3, 2016 at 4:08 PM, Greg Stark <stark(at)mit(dot)edu> wrote:
>> SELECT * FROM (SELECT * FROM (SELECT a FROM int_test UNION SELECT a
>> FROM int_test_padding) gg OFFSET 1e10) ff;
>
> ISTM OFFSET binds more loosely than UNION so these should be equivalent.

Not exactly:

postgres=# explain analyze select i from fff union select i from ggg
offset 1e10;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=357771.51..357771.51 rows=1 width=4) (actual
time=2989.378..2989.378 rows=0 loops=1)
-> Unique (cost=345771.50..357771.51 rows=2400002 width=4)
(actual time=2031.044..2930.903 rows=1500001 loops=1)
-> Sort (cost=345771.50..351771.51 rows=2400002 width=4)
(actual time=2031.042..2543.167 rows=2400002 loops=1)
Sort Key: fff.i
Sort Method: external merge Disk: 32840kB
-> Append (cost=0.00..58620.04 rows=2400002 width=4)
(actual time=0.048..435.408 rows=2400002 loops=1)
-> Seq Scan on fff (cost=0.00..14425.01
rows=1000001 width=4) (actual time=0.048..100.435 rows=1000001
loops=1)
-> Seq Scan on ggg (cost=0.00..20195.01
rows=1400001 width=4) (actual time=0.042..138.991 rows=1400001
loops=1)
Planning time: 0.123 ms
Execution time: 2999.564 ms
(10 rows)

postgres=# explain analyze select * from (select i from fff union
select i from ggg) fg offset 1e10;
QUERY
PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=381771.53..381771.53 rows=1 width=4) (actual
time=2982.519..2982.519 rows=0 loops=1)
-> Unique (cost=345771.50..357771.51 rows=2400002 width=4)
(actual time=2009.176..2922.874 rows=1500001 loops=1)
-> Sort (cost=345771.50..351771.51 rows=2400002 width=4)
(actual time=2009.174..2522.761 rows=2400002 loops=1)
Sort Key: fff.i
Sort Method: external merge Disk: 32840kB
-> Append (cost=0.00..58620.04 rows=2400002 width=4)
(actual time=0.056..428.934 rows=2400002 loops=1)
-> Seq Scan on fff (cost=0.00..14425.01
rows=1000001 width=4) (actual time=0.055..100.806 rows=1000001
loops=1)
-> Seq Scan on ggg (cost=0.00..20195.01
rows=1400001 width=4) (actual time=0.042..139.994 rows=1400001
loops=1)
Planning time: 0.127 ms
Execution time: 2993.294 ms
(10 rows)

The startup and total costs are greater in the latter case, but the
costs match at and below the Unique node. Whether or not this was
relevant is probably unimportant, though. My habit is to do the offset
outside of the subquery.

My theory is that the master branch happened to get a HashAggregate
for the 128MB case that caused us both confusion, because it looked
cheaper than an external sort + unique when the sort required many
passes on the master branch only (where my cost_sort() changes that
lower the costing of external sorts were not included). This wasn't a
low cardinality case, so the HashAggregate may have only won by a
small amount. I suppose that this could happen when the HashAggregate
was not predicted to use memory > work_mem, but a sort was. Then, as
the sort requires fewer merge passes with more work_mem, the master
branch starts to agree with the patch on the cheapest plan once again.
The trend of the patch being faster continues, after this one hiccup.

This is down to the cost_sort() changes, not the tuplesort.c changes.
But this was just a quirk, and the trend still seems clear. This
theory seems very likely based on this strange query's numbers for i5
master as work_mem increases:

Master: 16.711, 9.94, 4.891, 8.32, 4.88

Patch: 17.23, 9.77, 9.78, 4.95, 4.94

ISTM that master's last and third-from-last cases *both* use a
HashAggregate, where the patch behaves more consistently. After all,
the patch does smooth the cost function of sorting, an independently
useful goal to simply making sorting faster. We don't have to be
afraid of crossing an arbitrary, fuzzy threshold.

--
Peter Geoghegan

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2016-04-04 01:39:03 Re: Recovery test failure for recovery_min_apply_delay on hamster
Previous Message Corey Huinker 2016-04-04 00:42:37 Re: psql metaqueries with \gexec