Re: [PERFORM] Releasing memory during External sorting?

From: Ron Peacetree <rjpeace(at)earthlink(dot)net>
To: Dann Corbit <DCorbit(at)connx(dot)com>, pgsql-hackers(at)postgresql(dot)org, pgsql-performance(at)postgresql(dot)org
Subject: Re: [PERFORM] Releasing memory during External sorting?
Date: 2005-09-24 10:30:47
Message-ID: 24299535.1127557847225.JavaMail.root@elwamui-huard.atl.sa.earthlink.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-performance

From: Dann Corbit <DCorbit(at)connx(dot)com>
Sent: Sep 23, 2005 5:38 PM
Subject: RE: [HACKERS] [PERFORM] Releasing memory during External sorting?

>_C Unleashed_ also explains how to use a callback function to perform
>arbitrary radix sorts (you simply need a method that returns the
>[bucketsize] most significant bits for a given data type, for the length
>of the key).
>
>So you can sort fairly arbitrary data in linear time (of course if the
>key is long then O(n*log(n)) will be better anyway.)
>
>But in any case, if we are talking about external sorting, then disk
>time will be so totally dominant that the choice of algorithm is
>practically irrelevant.
>
Horsefeathers. Jim Gray's sorting contest site:
http://research.microsoft.com/barc/SortBenchmark/

proves that the choice of algorithm can have a profound affect on
performance. After all, the amount of IO done is the most
important of the things that you should be optimizing for in
choosing an external sorting algorithm.

Clearly, if we know or can assume the range of the data in question
the theoretical minimum amount of IO is one pass through all of the
data (otherwise, we are back in O(lg(n!)) land ). Equally clearly, for
HD's that one pass should involve as few seeks as possible.

In fact, such a principle can be applied to _all_ forms of IO: HD,
RAM, and CPU cache. The absolute best that any sort can
possibly do is to make one pass through the data and deduce the
proper ordering of the data during that one pass.

It's usually also important that our algorithm be Stable, preferably
Wholly Stable.

Let's call such a sort Optimal External Sort (OES). Just how much
faster would it be than current practice?

The short answer is the difference between how long it currently
takes to sort a file vs how long it would take to "cat" the contents
of the same file to a RAM buffer (_without_ displaying it). IOW,
there's SIGNIFICANT room for improvement over current
standard practice in terms of sorting performance, particularly
external sorting performance.

Since sorting is a fundamental operation in many parts of a DBMS,
this is a Big Deal.

This discussion has gotten my creative juices flowing. I'll post
some Straw Man algorithm sketches after I've done some more
thought.

Ron

> -----Original Message-----
> From: Dann Corbit <DCorbit(at)connx(dot)com>
> Sent: Friday, September 23, 2005 2:21 PM
> Subject: Re: [HACKERS] [PERFORM] Releasing memory during ...
>
>For the subfiles, load the top element of each subfile into a priority
>queue. Extract the min element and write it to disk. If the next
>value is the same, then the queue does not need to be adjusted.
>If the next value in the subfile changes, then adjust it.
>
>Then, when the lowest element in the priority queue changes, adjust
>the queue.
>
>Keep doing that until the queue is empty.
>
>You can create all the subfiles in one pass over the data.
>
>You can read all the subfiles, merge them, and write them out in a
>second pass (no matter how many of them there are).
>
The Gotcha with Priority Queues is that their performance depends
entirely on implementation. In naive implementations either Enqueue()
or Dequeue() takes O(n) time, which reduces sorting time to O(n^2).

The best implementations I know of need O(lglgn) time for those
operations, allowing sorting to be done in O(nlglgn) time.
Unfortunately, there's a lot of data manipulation going on in the
process and two IO passes are required to sort any given file.
Priority Queues do not appear to be very "IO friendly".

I know of no sorting performance benchmark contest winner based on
Priority Queues.

>Replacement selection is not a good idea any more, since obvious
>better ideas should take over. Longer runs are of no value if you do not
>have to do multiple merge passes.
>
Judging from the literature and the contest winners, Replacement
Selection is still a viable and important technique. Besides Priority
Queues, what "obvious better ideas" have you heard of?

>I have explained this general technique in the book "C Unleashed",
>chapter 13.
>
>Sample code is available on the book's home page.
>
URL please?

Browse pgsql-hackers by date

  From Date Subject
Next Message Martijn van Oosterhout 2005-09-24 11:13:00 Re: stack depth limit exceeded problem.
Previous Message Thomas Hallgren 2005-09-24 10:26:58 Re: stack depth limit exceeded problem.

Browse pgsql-performance by date

  From Date Subject
Next Message Ron Peacetree 2005-09-24 15:55:53 Re: Advice on RAID card
Previous Message PFC 2005-09-24 08:34:15 Advice on RAID card