Re: Using quicksort for every external sort run

From: Peter Geoghegan <pg(at)heroku(dot)com>
To: Greg Stark <stark(at)mit(dot)edu>
Cc: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Robert Haas <robertmhaas(at)gmail(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>
Subject: Re: Using quicksort for every external sort run
Date: 2015-08-20 22:16:48
Message-ID: CAM3SWZT-MhTAF4DNDPxCuDYRRQ-rFYqqtAr0OgB_nD3+gnQpOg@mail.gmail.com
Lists: pgsql-hackers

On Thu, Aug 20, 2015 at 6:05 AM, Greg Stark <stark(at)mit(dot)edu> wrote:
> On Thu, Aug 20, 2015 at 3:24 AM, Peter Geoghegan <pg(at)heroku(dot)com> wrote:
>> I believe, in general, that we should consider a multi-pass sort to be
>> a kind of inherently suspect thing these days, in the same way that
>> checkpoints occurring 5 seconds apart are: not actually abnormal, but
>> something that we should regard suspiciously. Can you really not
>> afford enough work_mem to only do one pass? Does it really make sense
>> to add far more I/O and CPU costs to avoid that other tiny memory
>> capacity cost?
>
> I think this is the crux of the argument. And I think you're
> basically, but not entirely, right.

I agree that that's the crux of my argument. I disagree about my not
being entirely right. :-)

> The key metric there is not how cheap memory has gotten but rather
> what the ratio is between the system's memory and disk storage. The
> use case I think you're leaving out is the classic "data warehouse"
> with huge disk arrays attached to a single host running massive
> queries for hours. In that case reducing run size will reduce I/O
> requirements directly and halving the amount of I/O sort takes will
> halve the time it takes regardless of cpu efficiency. And I have a
> suspicion typical data distributions get much better than a 2x
> speedup.

It could reduce seek time, which might be the dominant cost (but not
I/O as such). I do accept that my argument did not really apply to
this case, but you seem to be making an additional non-conflicting
argument that certain data warehousing cases would be helped in
another way by my patch. My argument was only about multi-gigabyte
cases that I tested that were significantly improved, primarily due to
CPU caching effects. If this helps with extremely large sorts that do
require multiple passes by reducing seek time -- I think that they'd
have to be multi-terabyte sorts, which I am ill-equipped to test --
then so much the better, I suppose.
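
To put rough numbers on that, here is a back-of-the-envelope sketch
(ordinary C, not tuplesort code; the run size and per-tape buffer size
are illustrative assumptions) that estimates how many merge passes an
external sort needs given work_mem and the input size:

/*
 * Back-of-the-envelope sketch, not PostgreSQL code: estimate the number
 * of merge passes an external sort needs.  Assumes quicksorted runs of
 * roughly work_mem each, and a merge order limited by how many per-tape
 * buffers fit in work_mem; both constants are illustrative.
 */
#include <math.h>
#include <stdio.h>

static int
merge_passes(double input_bytes, double work_mem_bytes,
             double tape_buffer_bytes)
{
    double      nruns = ceil(input_bytes / work_mem_bytes);
    double      merge_order = floor(work_mem_bytes / tape_buffer_bytes);

    if (nruns <= 1.0)
        return 0;               /* fits in memory: internal sort */
    if (merge_order < 2.0)
        merge_order = 2.0;
    return (int) ceil(log(nruns) / log(merge_order));
}

int
main(void)
{
    double      gb = 1024.0 * 1024.0 * 1024.0;

    /* 1GB work_mem, 256kB per tape buffer => merge order 4096 */
    printf("100GB input: %d pass(es)\n",
           merge_passes(100 * gb, gb, 256 * 1024));
    printf("1TB input:   %d pass(es)\n",
           merge_passes(1024 * gb, gb, 256 * 1024));
    printf("10TB input:  %d pass(es)\n",
           merge_passes(10 * 1024 * gb, gb, 256 * 1024));
    return 0;
}

With 1GB of work_mem and a merge order in the thousands, the 100GB and
1TB inputs come out at a single merge pass, and a second pass only
shows up somewhere in the multi-terabyte range (10TB in this example).
That is roughly where my "multi-terabyte" hand-waving comes from.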

In any case, as I've said, the way we allow run size to be dictated
only by available memory (plus whatever replacement selection can do
to make on-tape runs longer) is bogus. In the future there should be a
cost model for an optimal run size, too.

> But I think you're basically right that this is the wrong use case to
> worry about for most users. Even those users that do have large batch
> queries are probably not processing so much that they should be doing
> multiple passes. The ones that do are probably more interested in
> parallel query, federated databases, column stores, and so on rather
> than worrying about just how many hours it takes to sort their
> multiple terabytes on a single processor.

I suppose so. If you can afford multiple terabytes of storage, you can
probably still afford gigabytes of memory to do a single pass. My
laptop is almost 3 years old, weighs about 1.5 kg, and has 16 GiB of
memory. It's usually that simple, and not because we assume that
Postgres never has to deal with multi-terabyte sorts.
Maybe I lack perspective, having never really dealt with a real data
warehouse. I didn't mean to imply that in no circumstances could
anyone profit from a multi-pass sort. If you're using Hadoop or
something, I imagine that it still makes sense.

In general, I think you'll agree that we should strongly leverage the
fact that a multi-pass sort just isn't going to be needed when things
are set up correctly under standard operating conditions nowadays.

> I am quite suspicious of quicksort though. It has O(n^2) worst case
> and I think it's only a matter of time before people start worrying
> about DOS attacks from users able to influence the data ordering. It's
> also not very suitable for GPU processing. Quicksort gets most of its
> advantage from cache efficiency, it isn't a super efficient algorithm
> otherwise, are there not other cache efficient algorithms to consider?

I think that high quality quicksort implementations [1] will continue
to be the way to go for sorting integers internally at the very least.
Practically speaking, problems with the worst case performance have
been completely ironed out since the early 1990s. I think it's
possible to DOS Postgres by artificially introducing a worst-case, but
it's very unlikely to be the easiest way of doing that in practice. I
admit that it's probably the coolest way, though.
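
For anyone who hasn't looked at what modern quicksorts actually do,
here is a minimal sketch of the core trick from the Bentley/McIlroy
and Sedgewick line of work cited below: three-way partitioning with a
median-of-three pivot. To be clear, this is only an illustration of
the technique, not the qsort PostgreSQL actually uses, which layers on
more refinements (small-range insertion sort, better pivot selection
for large inputs, and so on):

/*
 * Illustrative three-way-partitioning quicksort with a median-of-three
 * pivot, in the spirit of Bentley/McIlroy and Sedgewick.  Not
 * PostgreSQL's qsort; just a sketch of the technique.
 */
#include <stdio.h>

static void
swap_int(int *a, int *b)
{
    int         tmp = *a;

    *a = *b;
    *b = tmp;
}

/* Reorder first, middle and last elements; return the median as pivot. */
static int
median_of_three(int *a, int lo, int hi)
{
    int         mid = lo + (hi - lo) / 2;

    if (a[mid] < a[lo])
        swap_int(&a[mid], &a[lo]);
    if (a[hi] < a[lo])
        swap_int(&a[hi], &a[lo]);
    if (a[hi] < a[mid])
        swap_int(&a[hi], &a[mid]);
    return a[mid];
}

static void
quicksort3(int *a, int lo, int hi)
{
    int         pivot;
    int         lt,
                gt,
                i;

    if (lo >= hi)
        return;

    pivot = median_of_three(a, lo, hi);

    /*
     * Three-way partition: a[lo..lt-1] < pivot, a[lt..gt] == pivot,
     * a[gt+1..hi] > pivot.  Keys equal to the pivot end up in the middle
     * band and are never recursed into.
     */
    lt = lo;
    gt = hi;
    i = lo;
    while (i <= gt)
    {
        if (a[i] < pivot)
            swap_int(&a[i++], &a[lt++]);
        else if (a[i] > pivot)
            swap_int(&a[i], &a[gt--]);
        else
            i++;
    }

    quicksort3(a, lo, lt - 1);
    quicksort3(a, gt + 1, hi);
}

int
main(void)
{
    int         data[] = {5, 1, 5, 9, 2, 5, 7, 5, 3, 5};
    int         n = (int) (sizeof(data) / sizeof(data[0]));
    int         i;

    quicksort3(data, 0, n - 1);
    for (i = 0; i < n; i++)
        printf("%d ", data[i]);
    printf("\n");
    return 0;
}

The point of pushing equal keys into the middle band is that they are
never recursed into, so the "all duplicates" input that is quadratic
for a naive quicksort becomes linear, while median-of-three (or a
median of medians on large ranges) keeps sorted and reverse-sorted
inputs well behaved too.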

I think that the benefits of offloading sorting to the GPU are not in
evidence today. This may be especially true of a "street legal"
implementation that takes into account all of the edge cases, as
opposed to a hand customized thing for sorting uniformly distributed
random integers. GPU sorts tend to use radix sort, and I just can't
see that catching on.

[1] https://www.cs.princeton.edu/~rs/talks/QuicksortIsOptimal.pdf
--
Peter Geoghegan
