Re: some aspects of our qsort might not be ideal

From: John Naylor <john(dot)naylor(at)enterprisedb(dot)com>
To: Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Robert Haas <robertmhaas(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: some aspects of our qsort might not be ideal
Date: 2022-06-30 10:21:19
Message-ID: CAFBsxsGK4zn7UH3Yr5QTCA7Qf6u3c+0xgHeL6m+0Jmo8VA_--Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

I've run the microbenchmarks only so far, since they complete much
faster than queries, and I wanted to tell quickly if this is a good
avenue to pursue. Later I'll repeat for queries. Methodology is
similar to what I did earlier in the thread, so I won't belabor that.

Takeaways:

- Contrary to some testing on single pivot I did some month ago with
fewer input distributions, in this test neither single nor dual pivot
benefit much from a large insertion sort threshold (i.e. 20 or so). I
picked 12 for both to do the head-to-head comparison, since this seems
to give a slight boost to specialized sorts on the slowest inputs.
They don't have to be the same, of course, but it seems about right
with these results. (Of course, it's easy to have specialized sorts
define their own threshold, as Thomas Munro has shown.)
- Generic qsort (type length and comparator passed at runtime) sees no
benefit from dual-pivot in this test, but specialized qsorts do get a
decent speedup.
- Descending inputs get a huge boost when specialized. This is
surprising to me, enough that I'm skeptical and want to double check
the test. If it's legitimate, I'm happy about it.

I've attached three spreadsheets with graphs, two for threshold tests
for single- and dual-pivot, and one comparing single and dual for
threshold=12.

0001 is the test module and rough script (not for commit). Note: The
server won't build, since the threshold is passed via CPPFLAGS.
0002 is trivial and mostly for assert builds
0003 is a mostly cleaned up patch for dual pivot, with passing
regression tests and default insertion sort threshold of 12

I'll add a CF entry for this.

TODO: add results for queries.
--
John Naylor
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v2-0001-Add-sort-test-module.patch text/x-patch 14.0 KB
v2-0002-Create-internal-workhorse-for-ST_SORT.patch text/x-patch 2.9 KB
v2-0003-Implement-dual-pivot-quicksort.patch text/x-patch 68.0 KB
v2-dual-vs-single-12.ods application/vnd.oasis.opendocument.spreadsheet 43.7 KB
dualpivot-thresholds-microbenchmark-20220630-graph.ods application/vnd.oasis.opendocument.spreadsheet 64.8 KB
singlepivot-thresholds-microbenchmark-20220630-graph.ods application/vnd.oasis.opendocument.spreadsheet 66.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Yugo NAGATA 2022-06-30 10:38:48 Support TRUNCATE triggers on foreign tables
Previous Message Alvaro Herrera 2022-06-30 09:57:30 Re: Add index item for MERGE.