Quick Links

Re: Using quicksort for every external sort run

From:	Robert Haas <robertmhaas(at)gmail(dot)com>
To:	Peter Geoghegan <pg(at)heroku(dot)com>
Cc:	Simon Riggs <simon(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>, Greg S <stark(at)mit(dot)edu>
Subject:	Re: Using quicksort for every external sort run
Date:	2016-01-30 13:29:38
Message-ID:	CA+TgmobVGAe-o7pro0j+CbBQKaR98Q_FspzHHVP_unsB+FQzDw@mail.gmail.com
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

On Sat, Jan 30, 2016 at 2:25 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> On Fri, Jan 29, 2016 at 2:58 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>> I don't quite know what you mean by these numbers. Add a generic,
>> conservative threshold to what?
>
> I meant use "quicksort with spillover" simply because an estimated
> 90%+ of all tuples have already been consumed. Don't consider the
> tuple width, etc.

Hmm, it's a thought.

>> Thinking about this some more, I really think we should think hard
>> about going back to the strategy which you proposed and discarded in
>> your original post: always generate the first run using replacement
>> selection, and every subsequent run by quicksorting. In that post you
>> mention powerful advantages of this method: "even without a strong
>> logical/physical correlation, the algorithm tends to produce runs that
>> are about twice the size of work_mem. (It's also notable that
>> replacement selection only produces one run with mostly presorted
>> input, even where input far exceeds work_mem, which is a neat trick.)"
>> You went on to dismiss that strategy, saying that "despite these
>> upsides, replacement selection is obsolete, and should usually be
>> avoided." But I don't see that you've justified that statement.
>
> Really? Just try it with a heap that is not tiny. Performance tanks.
> The fact that replacement selection can produce one long run then
> becomes a liability, not a strength. With a work_mem of something like
> 1GB, it's *extremely* painful.

I'm not sure exactly what you think I should try. I think a couple of
people have expressed the concern that your patch might regress things
on data that is all in order, but I'm not sure if you think I should
try that case or some case that is not-quite-in-order. "I don't see
that you've justified that statement" is referring to the fact that
you presented no evidence in your original post that it's important to
sometimes use quicksorting even for run #1. If you've provided some
test data illustrating that point somewhere, I'd appreciate a pointer
back to it.

> A compromise that may be acceptable is to always do a "quicksort with
> spillover" when there is a very low work_mem setting and the estimate
> of the number of input tuples is less than 10x of what we've seen so
> far. Maybe less than 20MB. That will achieve the same thing.

How about always starting with replacement selection, but limiting the
amount of memory that can be used with replacement selection to some
small value? It could be a separate GUC, or a hard-coded constant
like 20MB if we're fairly confident that the same value will be good
for everyone. If the tuples aren't in order, then we'll pretty
quickly come to the end of the first run and switch to quicksort. If
we do end up using replacement selection for the whole sort, the
smaller heap is an advantage. What I like about this sort of thing is
that it adds no reliance on any estimate; it's fully self-tuning.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

In response to

Re: Using quicksort for every external sort run at 2016-01-30 07:25:38 from Peter Geoghegan

Responses

Re: Using quicksort for every external sort run at 2016-02-04 09:46:15 from Peter Geoghegan

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Michael Paquier	2016-01-30 14:08:35	Re: Re: BUG #13685: Archiving while idle every archive_timeout with wal_level hot_standby
Previous Message	Petr Jelinek	2016-01-30 13:03:47	Re: Sequence Access Method WIP