Re: The case for removing replacement selection sort

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: The case for removing replacement selection sort
Date: 2017-09-28 22:44:59
Message-ID: CAH2-WzmSF_L5dQtf3JE55+z=QPSjijbSO90iLCDk_Vry_9RwOA@mail.gmail.com
Lists: pgsql-hackers

On Thu, Sep 28, 2017 at 3:18 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Fri, Jul 14, 2017 at 6:20 PM, Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>> With the additional enhancements made to Postgres 10, I doubt that
>> there are any remaining cases where it wins.
>
> I tried my favorite sorting test case -- pgbench -i -s 300 and then
> reindex index pgbench_accounts_pkey. I set maintenance_work_mem to
> 4MB so that replacement selection would be chosen.
>
> On master, this takes ~21.5 seconds; with replacement_sort_tuples = 0,
> it takes ~19.1 seconds. So, disabling replacement selection is a win.
>
> On 9.6, this takes ~19.1 seconds; with replacement_sort_tuples = 0, it
> takes ~23 seconds. So, disabling replacement selection is a loss.
>
> This supports your theory that we should go ahead and remove
> replacement selection.

I'm glad to hear that. But, I should reiterate that if sorting
actually gets faster when my patch is applied, then that's something
that I consider to be a bonus. (This is primarily a refactoring patch,
to remove a bunch of obsolete code.)

> It does however bother me a bit that this test
> case is apparently slower in master than in 9.6, and not just with
> very small values of work_mem. With default maintenance_work_mem
> (64MB), this takes about 13 seconds on 9.6 but about 15 seconds on
> master -- and with that setting replacement selection is not chosen at
> all.

As I mentioned, the picture changed for replacement selection during
the Postgres 10 development cycle, which caused me to finally push for
it to be killed in this thread. So, anyone who just started following
the thread should note that it's totally expected that replacement
selection still just about pulls its weight in 9.6, but not in 10.
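
For anyone new to the thread, here is a toy Python sketch of the
run-generation step under discussion. It is not tuplesort.c: the
function names are hypothetical, and memory is modelled as a fixed
number of tuples (mem_tuples) rather than a byte budget. It only shows
how textbook replacement selection forms runs with a heap, compared
with sorting one memory-load at a time.

```python
import heapq
import random

_EXHAUSTED = object()  # marks the end of the input stream


def replacement_selection_runs(items, mem_tuples):
    """Split items into sorted runs using textbook replacement selection.

    Heap entries are (run_number, value), so tuples that belong to the
    current run are always emitted before tuples deferred to the next run.
    Assumes mem_tuples >= 1.
    """
    it = iter(items)
    heap = []
    for value in it:  # fill "memory" with the first mem_tuples values
        heap.append((0, value))
        if len(heap) == mem_tuples:
            break
    heapq.heapify(heap)

    runs = []
    while heap:
        run_no, value = heapq.heappop(heap)
        if run_no == len(runs):
            runs.append([])  # first tuple of a new run
        runs[run_no].append(value)
        incoming = next(it, _EXHAUSTED)
        if incoming is not _EXHAUSTED:
            # A tuple smaller than the one just written out cannot extend
            # the current run, so it is deferred to the next one.
            target = run_no if incoming >= value else run_no + 1
            heapq.heappush(heap, (target, incoming))
    return runs


def quicksort_runs(items, mem_tuples):
    """Split items into sorted runs by sorting one memory-load at a time."""
    items = list(items)
    return [sorted(items[i:i + mem_tuples])
            for i in range(0, len(items), mem_tuples)]


if __name__ == "__main__":
    random.seed(0)
    data = [random.random() for _ in range(20_000)]
    print("replacement selection runs:",
          len(replacement_selection_runs(data, 500)))
    print("quicksort runs:", len(quicksort_runs(data, 500)))
```

On random input the heap variant should produce roughly half as many
runs, and on presorted input a single run. Fewer runs is the whole
appeal of the technique; the sketch says nothing about the per-tuple
heap cost, which is what the timings in this thread actually measure.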

> Any idea what causes this regression?

I don't know. My best guess is that the overall I/O scheduling is now
suboptimal due to commit e94568e (Heikki's preloading thing), because
this is CREATE INDEX, and there is new competitive pressure. You might
find it hard to replicate the problem with a "SELECT COUNT(DISTINCT
aid) FROM pgbench_accounts", which would confirm this explanation. Or,
you could also see what happens with a separate temp tablespace.

It's really, really hard to have a 100% unambiguous win within
tuplesort.c. We've seen this time and again over the years.

--
Peter Geoghegan
