Re: Using quicksort for every external sort run

From: David Fetter <david(at)fetter(dot)org>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)heroku(dot)com>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Using quicksort for every external sort run
Date: 2015-11-30 04:02:59
Message-ID: 20151130040259.GA22300@fetter.org
Lists: pgsql-hackers

On Sat, Nov 28, 2015 at 02:04:16PM -0800, Jeff Janes wrote:
> On Wed, Nov 18, 2015 at 3:29 PM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
> > On Wed, Nov 18, 2015 at 10:31 AM, Jeff Janes <jeff(dot)janes(at)gmail(dot)com> wrote:
> >
> >> I agree we don't want to optimize for low memory, but I don't think we
> >> should throw it under the bus, either. Right now we are effectively
> >> saying the CPU-cache problems with the heap start exceeding the larger
> >> run size benefits at 64kB (the smallest allowed setting for work_mem).
> >> While any number we pick is going to be a guess that won't apply to
> >> all hardware, surely we can come up with a guess better than 64kB.
> >> Like, 8 MB, say. If available memory for the sort is 8MB or smaller
> >> and the predicted size anticipates a multipass merge, then we can use
> >> the heap method rather than the quicksort method. Would a rule like
> >> that complicate things much?
> >
> > I'm already using replacement selection for the first run when it is
> > predicted by my new ad-hoc cost model that we can get away with a
> > "quicksort with spillover", avoiding almost all I/O. We only
> > incrementally spill as many tuples as needed right now, but it would
> > be pretty easy to not quicksort the remaining tuples, but continue to
> > incrementally spill everything. So no, it wouldn't be too hard to hang
> > on to the old behavior sometimes, if it looked worthwhile.
> >
> > In principle, I have no problem with doing that. Through testing, I
> > cannot see any actual upside, though. Perhaps I just missed something.
> > Even 8MB is enough to avoid the multipass merge in the event of a
> > surprisingly high volume of data (my work laptop is elsewhere, so I
> > don't have my notes on this in front of me, but I figured out the
> > crossover point for a couple of cases).
>
> For me very large sorts (100,000,000 ints) with work_mem below 4MB do
> better with unpatched than with your patch series, by about 5%. Not a
> big deal, but also if it is easy to keep the old behavior then I think
> we should. Yes, it is dumb to do large sorts with work_mem below 4MB,
> but if you have canned apps which do a mixture of workloads it is not
> so easy to micromanage their work_mem. Especially as there are no
> easy tools that let me as the DBA say "if you connect from this IP
> address, you get this work_mem".

That's certainly doable with pgbouncer, for example. What would you
have in mind for the more general capability? It seems to me that
bloating up pg_hba.conf would be undesirable, but maybe I'm picturing
this as bigger than it actually needs to be.
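For instance (an untested sketch; hosts and aliases are made up, and the
interaction with server_reset_query would need checking), pgbouncer's
per-database connect_query can pin work_mem for whatever connects through
a given alias, so each canned app gets its own setting without touching
the app itself:

    ; pgbouncer.ini -- two aliases for the same database, each running a
    ; connect_query when a new server connection is established.
    ; work_mem given as a bare integer is in kB.
    [databases]
    app_oltp   = host=10.0.0.5 dbname=prod connect_query='SET work_mem = 4096'
    app_report = host=10.0.0.5 dbname=prod connect_query='SET work_mem = 262144'

And where the apps already connect as distinct roles, plain
ALTER ROLE appuser SET work_mem = '8MB' does it without any middleware.
Neither gives you the per-IP-address granularity you describe, though.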

Cheers,
David.
--
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
