Re: Sorting Improvements for 8.4

From: Ron Mayer <rm_pg(at)cheapcomplexdevices(dot)com>
To: Mark Mielke <mark(at)mark(dot)mielke(dot)cc>
Subject: Re: Sorting Improvements for 8.4
Date: 2007-12-19 18:26:17
Message-ID: 47696249.4090602@cheapcomplexdevices.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Mark Mielke wrote:
> I am curious - what algorithms exist to efficiently do a parallel sort?
> Do you mean if sorting 1 million items, it is possible to separate this
> into 2 sets of 500 thousand each, execute them in separate threads
> (with task administration and synchronization overhead) , combine the
> results, and complete the task in significantly less time than doing it
> in one thread? I am skeptical that this is possible...

The link in the beginning of the thread points to articles
that seem to describe one such algorithm; along with benchmarks.
(http://tinyurl.com/3bvu4u, http://tinyurl.com/32wg2m)
The improvements were pretty consistent from set sizes ranging
from very small sets (hundreds) to quite large ones (hundreds of K).

Interestingly, even multi-threading helped a lot.

"Our tests correlate well with previous research that showed
Intel’s implementation of SMT (Hyper-Threading) to be
adept at hiding this latency [6, 20, 12].Table 4 shows that by
having two threads access memory at the same time, performance
improved over 80% when compared to the singlethreaded version.

It uses both quicksort phases and merge phases; for the merge phase
using 2CPUs (no hyperthreading) apparently gave more than 2X speed
improvement; apparently because it could parallelize memory access
with CPU more.

> Or do you mean being able to perform parts of the query plan fully in
> parallel? If this, then one would need a lot more than ParallelSort...

I wouldn't recommend that - it seems like a Hard Problem.

My guess is that the best way to use multiple threads in one backend
would be to find specific algorithms like sorting that would be
easier to isolate.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2007-12-19 18:46:15 Re: [HACKERS] Proposal for Null Bitmap Optimization(for TrailingNULLs)
Previous Message Jeff Davis 2007-12-19 18:18:33 Re: Sorting Improvements for 8.4