Re: Using quicksort for every external sort run

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Greg Stark <stark(at)mit(dot)edu>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
Subject: Re: Using quicksort for every external sort run
Date: 2016-03-29 16:11:25
Message-ID: CA+TgmoaXYtofbXdjzBneEjsx8a1Z9A+TVB1mSTUFAnrdNu=BTA@mail.gmail.com
Lists: pgsql-hackers

On Mon, Mar 28, 2016 at 11:18 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> Note that amcheck V2, which I posted just now, features tests for
> external sorting. The way these work requires discussion. The tests
> are motivated in part by the recent strxfrm() debacle, as well as by
> the need to have at least some test coverage for this patch. It's bad
> that external sorting currently has no test coverage. We should try
> and do better there as part of this overhaul to tuplesort.c.

Test coverage is good!

However, I don't see that you've responded to Tomas Vondra's report of
regressions. Maybe you're waiting for more data from him, but we're
running out of time here. I think what we need to decide is whether
these results are bad enough that the patch needs more work on the
regressed cases, or whether we're comfortable with some regressions in
low-memory configurations for the benefit of higher-memory
configurations. I'm kind of on the fence about that, myself.

One test that kind of bothers me in particular is the "SELECT DISTINCT
a FROM numeric_test ORDER BY a" test on the high_cardinality_random
data set. That's a wash at most work_mem values, but at 32MB it's
more than 3x slower. That's very strange, and there are a number of
other results like that, where one particular work_mem value triggers
a large regression. That's worrying.

Also, it's pretty clear that the patch has more large wins than it
does large losses, but it seems pretty easy to imagine people who
haven't tuned any GUCs writing in to say that 9.6 is way slower on
their workload, because those people are going to be at work_mem=4MB,
maintenance_work_mem=64MB. At those numbers, if Tomas's data is
representative, it's not hard to imagine that the number of people who
see a significant regression might be quite a bit larger than the
number who see a significant speedup.

On the whole, I'm tempted to say this needs more work before we commit
to it, but I'd like to hear other opinions on that point.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
