"city" sort
===========
sort-city.sql:
select * from (select * from cities order by city offset 100000) d;

Master:
-------

pgbench -M prepared -f sort-city.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 102
latency average: 1176.471 ms
tps = 0.844444 (including connections establishing)
tps = 0.844468 (excluding connections establishing)

Patch:
------

pgbench -M prepared -f sort-city.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 314
latency average: 382.166 ms
tps = 2.614520 (including connections establishing)
tps = 2.614549 (excluding connections establishing)

*** 309.61% of original transaction throughput ***
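
The percentage above is the patched tps expressed as a fraction of the master
tps. A minimal Python sketch of that arithmetic follows, using the "excluding
connections establishing" figures from this section. The function name is
purely illustrative and not part of pgbench or the patch.

```python
def pct_of_original(master_tps: float, patch_tps: float) -> float:
    """Patched throughput as a percentage of master throughput."""
    return 100.0 * patch_tps / master_tps


# "city" sort, tps excluding connections establishing
city = pct_of_original(master_tps=0.844468, patch_tps=2.614549)
print(f"{city:.2f}% of original transaction throughput")
```

The same call applied to the tps pairs in the later sections gives the
figures quoted there.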


"province" sort
===============
sort-province.sql:
select * from (select * from cities order by province offset 100000) d;

Master:
-------

pgbench -M prepared -f sort-province.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 183
latency average: 655.738 ms
tps = 1.522669 (including connections establishing)
tps = 1.522684 (excluding connections establishing)

Patch:
------

pgbench -M prepared -f sort-province.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 458
latency average: 262.009 ms
tps = 3.813387 (including connections establishing)
tps = 3.813426 (excluding connections establishing)

*** 250.44% of original transaction throughput ***

"country" sort
==============
sort-country.sql:
select * from (select * from cities order by country offset 100000) d;

Master:
-------

pgbench -M prepared -f sort-country.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 219
latency average: 547.945 ms
tps = 1.822396 (including connections establishing)
tps = 1.822418 (excluding connections establishing)

Patch:
------

pgbench -M prepared -f sort-country.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 545
latency average: 220.183 ms
tps = 4.540027 (including connections establishing)
tps = 4.540077 (excluding connections establishing)

*** 249.12% of original transaction throughput ***

Heikki's worst case
===================

postgres=# create table sorttest (t text);
CREATE TABLE
postgres=# insert into sorttest select 'foobarfo' || (g) || repeat('a', 75) from generate_series(10000, 30000) g;
INSERT 0 20001

worst-sort.sql: select * from (select * from sorttest order by t offset 30000) d;

Master:
-------

pgbench -M prepared -f worst-sort.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 9409
latency average: 12.754 ms
tps = 78.404182 (including connections establishing)
tps = 78.404998 (excluding connections establishing)

Patch:
------

pgbench -M prepared -f worst-sort.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 9304
latency average: 12.898 ms
tps = 77.533223 (including connections establishing)
tps = 77.534061 (excluding connections establishing)

*** 98.89% of original transaction throughput ***

Variant of Heikki's worst case
==============================

In the original worst case, whether the "poorman" optimization is applicable
must be established almost immediately, by considering cardinality while
copying over heap tuples. There is no "wait and see" period where an
estimated 10% of all rows to be sorted are observed before a firm conclusion
is reached, purely because mean string length so far is >= 64. That's why the
regression is now so marginal. However, the obvious case to look at now is the
same case where that isn't quite true -- the case where we must pay for a
futile "wait and see" period.

postgres=# create table sorttest (t text);
CREATE TABLE
postgres=# insert into sorttest select 'foobarfo' || (g) || repeat('a', 50) from generate_series(10000, 30000) g;
INSERT 0 20001
postgres=# analyze sorttest ;
ANALYZE
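
The two worst-case tables differ only in the repeat() padding (75 versus 50).
As a rough check of how that interacts with the ">= 64" length test mentioned
above, the Python sketch below rebuilds the same strings and reports their
mean length. The threshold constant is taken from the text of this report,
not from the patch source, and the helper name is hypothetical.

```python
PREFIX = "foobarfo"
THRESHOLD = 64  # assumption: the ">= 64" figure quoted in the text above


def mean_length(pad: int, first: int = 10000, last: int = 30000) -> float:
    """Mean length of 'foobarfo' || g || repeat('a', pad) over generate_series(first, last)."""
    lengths = [len(PREFIX + str(g) + "a" * pad) for g in range(first, last + 1)]
    return sum(lengths) / len(lengths)


for pad in (75, 50):
    mean = mean_length(pad)
    print(f"pad={pad}: mean length {mean:.0f}, >= {THRESHOLD}: {mean >= THRESHOLD}")
```

With a padding of 75 every string is 88 characters long, with a padding of 50
it is 63, so the variant falls just under the threshold.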

worst-sort.sql: select * from (select * from sorttest order by t offset 30000) d;

Master:
-------

pgbench -M prepared -f worst-sort.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 9537
latency average: 12.583 ms
tps = 79.471599 (including connections establishing)
tps = 79.472409 (excluding connections establishing)

Patch:
------

pgbench -M prepared -f worst-sort.sql -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 1
number of threads: 1
duration: 120 s
number of transactions actually processed: 9228
latency average: 13.004 ms
tps = 76.895525 (including connections establishing)
tps = 76.896292 (excluding connections establishing)

*** 96.76% of original transaction throughput ***

Multiple clients, "city" sort
=============================

Master:
-------

pgbench -M prepared -f sort-city.sql -j 4 -c 4 -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 4
number of threads: 4
duration: 120 s
number of transactions actually processed: 201
latency average: 2388.060 ms
tps = 1.665137 (including connections establishing)
tps = 1.665185 (excluding connections establishing)

Patch:
------

pgbench -M prepared -f sort-city.sql -j 4 -c 4 -T 120 -n

transaction type: Custom query
scaling factor: 1
query mode: prepared
number of clients: 4
number of threads: 4
duration: 120 s
number of transactions actually processed: 681
latency average: 704.846 ms
tps = 5.650860 (including connections establishing)
tps = 5.651000 (excluding connections establishing)

*** 339.36% of original transaction throughput ***
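
When comparing the single-client and four-client runs, note that the reported
"latency average" values are consistent with duration times clients divided by
transactions processed. The Python sketch below checks that relationship
against numbers from this report. It is an observation about the figures here,
not a statement about pgbench internals, and the function name is illustrative.

```python
def latency_ms(duration_s: float, clients: int, transactions: int) -> float:
    """Average latency in ms implied by run length, client count and transaction count."""
    return 1000.0 * duration_s * clients / transactions


runs = [
    ("city, master, 1 client", 1, 102),
    ("city, master, 4 clients", 4, 201),
    ("city, patch, 4 clients", 4, 681),
]

for label, clients, transactions in runs:
    print(f"{label}: {latency_ms(120, clients, transactions):.3f} ms")
```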