Re: Using quicksort for every external sort run

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Using quicksort for every external sort run
Date: 2015-12-01 00:29:38
Message-ID: CAM3SWZRTODngPxko3vJ5CcUrGzVqpQ7_Ff75Sioxq8+Hv=d4_A@mail.gmail.com

On Mon, Nov 30, 2015 at 12:29 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> I'm kind of curious as to why the optimal for the patched code appears
>> at 1GB and not lower. If I get a chance to rebuild the test, I will
>> look into that more.
>
> I think that the availability of abbreviated keys (or something that
> allows most comparisons made by quicksort/the heap to be resolved at
> the SortTuple level) could make a big difference for things like this.

Using the Hydra POWER7 server [1] and the gensort benchmark [2] (which
uses the C collation, and so produces abbreviated keys with lots of
entropy), I see benefits at higher and higher maintenance_work_mem
settings.
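
To give a rough sense of why abbreviated keys matter here, consider
how often the first 8 bytes of a key (roughly what the abbreviated key
boils down to for C collation text on 64-bit builds) fail to
distinguish adjacent keys in C collation sort order. This is just an
illustrative query, not part of the benchmark, and it assumes the
gensort data lives in a text column named "key", which may not match
the actual schema:

-- Hypothetical: assumes a text column named "key" holding gensort keys.
-- With random ASCII keys, abbreviated_ties should be at or near zero,
-- meaning the abbreviated key decides almost every comparison without
-- dereferencing the full datum.
SELECT count(*) AS adjacent_pairs,
       count(*) FILTER (WHERE left(key, 8) = left(next_key, 8))
           AS abbreviated_ties
FROM (SELECT key,
             lead(key) OVER (ORDER BY key COLLATE "C") AS next_key
      FROM sort_test) s
WHERE next_key IS NOT NULL;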

I will present a variety of cases, since that seemed to be something
Greg Stark is particularly interested in. On the whole, I am quite
pleased with how things are shown to improve across a variety of
different scenarios.

Looking at CREATE INDEX build times on an (unlogged) gensort table
with 50 million, 100 million, 250 million, and 500 million tuples, and
maintenance_work_mem settings of 512MB, 1GB, 10GB, and 15GB, there are
sustained improvements as more memory is made available. I'm not
saying that would be the case with low cardinality leading attribute
tuples -- probably not -- but it seems pretty nice that this case
keeps improving as more memory is added. The server used here has
reasonably good disks (Robert goes into this in his blogpost [1]), but
nothing spectacular.

This is what a 500 million tuple gensort table looks like:

postgres=# \dt+
                      List of relations
 Schema |   Name    | Type  | Owner | Size  | Description
--------+-----------+-------+-------+-------+-------------
 public | sort_test | table | pg    | 32 GB |
(1 row)
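
Each data point was collected with something like the following (a
rough sketch rather than an exact transcript; the key column name is
an assumption about the gensort table's schema):

SET client_min_messages = log;    -- show trace_sort output in-session
SET trace_sort = on;              -- emits the "external sort ended" lines
SET maintenance_work_mem = '1GB'; -- varied per run: 512MB/1GB/10GB/15GB
CREATE INDEX ON sort_test (key);  -- dropped again between runs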

Results:

50 million tuple table (best of 3):
--------------------------------------------

512MB: (8-way final merge) external sort ended, 171058 disk blocks
used: CPU 4.11s/79.30u sec elapsed 83.60 sec
1GB: (4-way final merge) external sort ended, 171063 disk blocks used:
CPU 4.29s/71.34u sec elapsed 75.69 sec
10GB: N/A
15GB: N/A
1GB (master branch): (3-way final merge) external sort ended, 171064
disk blocks used: CPU 6.19s/163.00u sec elapsed 170.84 sec

100 million tuple table (best of 3):
--------------------------------------------

512MB: (16-way final merge) external sort ended, 342114 disk blocks
used: CPU 8.61s/177.77u sec elapsed 187.03 sec
1GB: (8-way final merge) external sort ended, 342124 disk blocks used:
CPU 8.07s/165.15u sec elapsed 173.70 sec
10GB: N/A
15GB: N/A
1GB (master branch): (5-way final merge) external sort ended, 342129
disk blocks used: CPU 11.68s/358.17u sec elapsed 376.41 sec

250 million tuple table (best of 3):
--------------------------------------------

512MB: (39-way final merge) external sort ended, 855284 disk blocks
used: CPU 19.96s/486.57u sec elapsed 507.89 sec
1GB: (20-way final merge) external sort ended, 855306 disk blocks
used: CPU 22.63s/475.33u sec elapsed 499.09 sec
10GB: (2-way final merge) external sort ended, 855326 disk blocks
used: CPU 21.99s/341.34u sec elapsed 366.15 sec
15GB: (2-way final merge) external sort ended, 855326 disk blocks
used: CPU 23.23s/322.18u sec elapsed 346.97 sec
1GB (master branch): (11-way final merge) external sort ended, 855315
disk blocks used: CPU 30.56s/973.00u sec elapsed 1015.63 sec

500 million tuple table (best of 3):
--------------------------------------------

512MB: (77-way final merge) external sort ended, 1710566 disk blocks
used: CPU 45.70s/1016.70u sec elapsed 1069.02 sec
1GB: (39-way final merge) external sort ended, 1710613 disk blocks
used: CPU 44.34s/1013.26u sec elapsed 1067.16 sec
10GB: (4-way final merge) external sort ended, 1710649 disk blocks
used: CPU 46.46s/772.97u sec elapsed 841.35 sec
15GB: (3-way final merge) external sort ended, 1710652 disk blocks
used: CPU 51.55s/729.88u sec elapsed 809.68 sec
1GB (master branch): (20-way final merge) external sort ended, 1710632
disk blocks used: CPU 69.35s/2013.21u sec elapsed 2113.82 sec
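
To spell out the headline numbers in that last case: master at 1GB
took 2113.82 seconds, while the patch took 1067.16 seconds at the same
1GB setting (2113.82 / 1067.16, or roughly 2x faster), and 809.68
seconds at 15GB (2113.82 / 809.68, or roughly 2.6x faster).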

I attached a detailed account of these benchmarks, for those who
really want to see the nitty-gritty. This includes a 1GB case for the
patch without memory prefetching (which is not described in this
message).

[1] http://rhaas.blogspot.com/2012/03/performance-and-scalability-on-ibm.html
[2] https://github.com/petergeoghegan/gensort
--
Peter Geoghegan

Attachment: sort-benchmarks.tar.gz (application/x-gzip, 55.4 KB)
