"city" sort =========== sort-city.sql: select * from (select * from cities order by city offset 100000) d; Master: ------- pgbench -M prepared -f sort-city.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 102 latency average: 1176.471 ms tps = 0.844444 (including connections establishing) tps = 0.844468 (excluding connections establishing) Patch: ------ pgbench -M prepared -f sort-city.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 0 number of threads: 1 duration: 120 s number of transactions actually processed: 314 latency average: 382.166 ms tps = 2.614520 (including connections establishing) tps = 2.614549 (excluding connections establishing) *** 309.61% of original transaction throughput *** "province" sort =============== sort-province.sql: select * from (select * from cities order by province offset 100000) d; Master: ------- pgbench -M prepared -f sort-province.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 183 latency average: 655.738 ms tps = 1.522669 (including connections establishing) tps = 1.522684 (excluding connections establishing) Patch: ------ pgbench -M prepared -f sort-province.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 458 latency average: 262.009 ms tps = 3.813387 (including connections establishing) tps = 3.813426 (excluding connections establishing) *** 250.44% of original transaction throughput *** "country" sort ============== sort-country.sql: select * from (select * from cities order by country offset 100000) d; Master: ------- pgbench -M prepared -f sort-country.sql -T 120 
-n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 219 latency average: 547.945 ms tps = 1.822396 (including connections establishing) tps = 1.822418 (excluding connections establishing) Patch: ------ pgbench -M prepared -f sort-country.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 545 latency average: 220.183 ms tps = 4.540027 (including connections establishing) tps = 4.540077 (excluding connections establishing) *** 249.12% of original transaction throughput *** Heikki's worst case =================== postgres=# create table sorttest (t text); CREATE TABLE postgres=# insert into sorttest select 'foobarfo' || (g) || repeat('a', 75) from generate_series(10000, 30000) g; INSERT 0 20001 worst-sort.sql: select * from (select * from sorttest order by t offset 30000) d; Master: ------- pgbench -M prepared -f worst-sort.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 9409 latency average: 12.754 ms tps = 78.404182 (including connections establishing) tps = 78.404998 (excluding connections establishing) Patch: ------ transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 9304 latency average: 12.898 ms tps = 77.533223 (including connections establishing) tps = 77.534061 (excluding connections establishing) *** 98.88% of original transaction throughput *** Variant of Heikki's worst case ============================== The original worst case is required to prove itself poorman applicable by considering cardinality almost immediately when copying over heap tuples. 
There is no "wait and see" period where an estimated 10% of all rows to be sorted are observed before a firm conclusion is reached, purely because mean query length so far is >= 64. That's why the regression is now so marginal. However, the obvious case to look at now becomes the same case where that isn't quite true -- the case where we must pay for a futile "wait and see" period. postgres=# create table sorttest (t text); CREATE TABLE postgres=# insert into sorttest select 'foobarfo' || (g) || repeat('a', 50) from generate_series(10000, 30000) g; INSERT 0 20001 postgres=# analyze sorttest ; ANALYZE worst-sort.sql: select * from (select * from sorttest order by t offset 30000) d; Master: ------- pgbench -M prepared -f worst-sort.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 9537 latency average: 12.583 ms tps = 79.471599 (including connections establishing) tps = 79.472409 (excluding connections establishing) Patch: ------ pgbench -M prepared -f worst-sort.sql -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 1 number of threads: 1 duration: 120 s number of transactions actually processed: 9228 latency average: 13.004 ms tps = 76.895525 (including connections establishing) tps = 76.896292 (excluding connections establishing) *** 96.7% of original transaction throughput *** Multiple clients, "city" sort ============================= Master: ------- pgbench -M prepared -f sort-city.sql -j 4 -c 4 -T 120 -n transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 4 number of threads: 4 duration: 120 s number of transactions actually processed: 201 latency average: 2388.060 ms tps = 1.665137 (including connections establishing) tps = 1.665185 (excluding connections establishing) Patch: ------ pgbench -M prepared -f sort-city.sql -j 4 -c 4 -T 120 -n 
transaction type: Custom query scaling factor: 1 query mode: prepared number of clients: 4 number of threads: 4 duration: 120 s number of transactions actually processed: 681 latency average: 704.846 ms tps = 5.650860 (including connections establishing) tps = 5.651000 (excluding connections establishing) *** 339.36% of original transaction throughput ***
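
Recomputing the summary figures
===============================

The "% of original transaction throughput" lines can be cross-checked
against the tps figures reported above. The following is a minimal
sketch, not part of pgbench or the patch: the function names are
illustrative, and the latency formula (duration * clients /
transactions) is an assumption that happens to match the reported
"latency average" values. It uses the "excluding connections
establishing" tps numbers; the last digit may differ slightly from the
quoted percentages depending on rounding.

```python
# Cross-check of the summary figures quoted in the benchmark output.
# Helper names are hypothetical; they are not pgbench functions.

def throughput_pct(master_tps: float, patch_tps: float) -> float:
    """Patched throughput as a percentage of master throughput."""
    return 100.0 * patch_tps / master_tps


def latency_ms(duration_s: float, clients: int, transactions: int) -> float:
    """Average latency in ms, assuming duration * clients / transactions."""
    return 1000.0 * duration_s * clients / transactions


# (master tps, patch tps), both "excluding connections establishing",
# copied from the runs above.
RUNS = {
    "city": (0.844468, 2.614549),
    "province": (1.522684, 3.813426),
    "country": (1.822418, 4.540077),
    "Heikki's worst case": (78.404998, 77.534061),
    "worst case variant": (79.472409, 76.896292),
    "city, 4 clients": (1.665185, 5.651000),
}

if __name__ == "__main__":
    for name, (master, patch) in RUNS.items():
        pct = throughput_pct(master, patch)
        print(f"{name}: {pct:.2f}% of original transaction throughput")
```

For example, latency_ms(120, 4, 201) reproduces the master latency of
the four-client run to within rounding.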