Re: The case for removing replacement selection sort

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Greg Stark <stark(at)mit(dot)edu>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: The case for removing replacement selection sort
Date: 2017-09-15 13:34:34
Message-ID: e7b074f5-77cf-0d69-b3a7-9da1d2893f63@2ndquadrant.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Attached is a spreadsheet with updated data (using the more accurate
timing, and comparing master with the default replacement_sort_tuples
value (150k) and increased per Peter's instructions (to 1B).

There are 6 sheets - one for each combination of a dataset size (100k
and 1M) and machine where I ran the tests (with different CPU models).

There are 5 columns - first three are medians for each of the tested
PostgreSQL configurations:

- master (default)
- master (1billion)
- no-replacement-sort (with patch applied)

The numbers are medians from 5 runs (perhaps minimums would be better in
this case).

The last two columns are comparisons / speed-ups

- master(1B) / master(default)
- no-replacement-sort / master(default)

Green numbers (below 1.0) mean speed-up, red (above 1.0) slow-down.

Firstly, the master with r_s_t=1B setting performs either the same or
worse compared to a default values, in almost every test. On the 100k
data set the results are a bit noisy (particularly on the oldest CPU),
but on the 1M data set the difference is quite clear. So I've only
compared results for master(default) and patched.

Quick summary, for each CPU model (which clearly affects the behavior).

e5-2620-v4
----------
- probably the CPU we should be looking at, as it's the current model
- in some cases this gives us 3-5x speedup with low work_mem settings
- consider for example the very last line, which shows that

SELECT DISTINCT a FROM text_test_padding ORDER BY a

completed in ~5531ms on work_mem=8MB and 1067ms on 32MB, but with the
patch it completes in 1784ms (1MB), 1211ms (4MB) and 1104 (8MB).

- Of course, this is for already-sorted data, and for other data sets
the improvement is more modest. It's difficult to summarize this into a
single number, but there are plenty of improvements in 20-30% range.

- Some cases are a bit slower, of course, but overall I'd say the chart
is much more green than red. Also the slow-downs are much smaller,
compared to speed-ups (generally within 1-5%).

i5-2500k
--------
- same story, but this CPU gives more stable results (less noise)

e5-5450
-------
- rather noisy CPU, particularly on the small (100k) dataset
- the larger data set mostly matches the other CPUs, although the
regressions are somewhat more significant
- I wouldn't really worry about this too much, it's clearly an obsolete
CPU and not something performance-conscious person would use nowadays
(the other CPUs are often 2-3x faster).

If needed, full data is available here (each machine is pushing data to
a separate git repository):

* https://bitbucket.org/tvondra/sort-bench-i5-2500k
* https://bitbucket.org/tvondra/sort-bench-e5-2620-v4
* https://bitbucket.org/tvondra/sort-bench-e5-5450

At this point the 10M row tests are running, but I don't expect anything
particularly surprising from those results. That is, it's not something
that should block getting this patch committed, if the agreement is to
commit otherwise.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment Content-Type Size
sort-benchmarks.ods application/vnd.oasis.opendocument.spreadsheet 1.0 MB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2017-09-15 13:49:43 Re: additional contrib test suites
Previous Message Nico Williams 2017-09-15 13:12:07 Re: COMMIT TRIGGERs, take n, implemented with CONSTRAINT TRIGGERS