Re: Merge algorithms for large numbers of "tapes"

From: "Dann Corbit" <DCorbit(at)connx(dot)com>
To: "Jonah H(dot) Harris" <jonah(dot)harris(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-08 03:44:17
Message-ID: D425483C2C5C9F49B5B7A41F8944154757D5DC@postal.corporate.connx.com
Lists: pgsql-hackers

Use a priority queue over the sorted sub-lists.  When the key-object
extracted from the head of the lowest sub-list exceeds the key-object
at the head of the next-lowest sub-list, adjust that sub-list's
priority within the queue of sub-lists.

This uses a total of two read/write passes over the data, no matter how
many subfiles you have, and it is dramatically faster than other kinds
of external merge when you have lots of subfiles.

I posted a message to the list on this subject before, with a pointer
to sample code that demonstrates the concept.

Even with one million subfiles, it still takes a total of only two
read/write passes.
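
The sample code referred to above isn't reproduced here, but a minimal,
self-contained sketch of the idea might look like the following.  The
Run struct, the in-memory arrays standing in for on-disk runs, and the
sift_down helper are illustrative stand-ins only, not PostgreSQL's tape
interface; in a real external sort each run would be streamed from
disk, which is where the two-pass property comes from.

/*
 * Minimal sketch: k-way merge driven by a priority queue (binary
 * min-heap) over the current head element of each sorted run.
 * The runs are in-memory int arrays purely for illustration.
 */
#include <stddef.h>
#include <stdio.h>

typedef struct
{
	const int  *data;			/* the sorted run */
	size_t		len;			/* number of elements in the run */
	size_t		pos;			/* next element to emit from this run */
} Run;

/* heap of run indexes, ordered by each run's current head element */
static Run *runs;
static size_t *heap;
static size_t heap_size;

static int
head(size_t r)
{
	return runs[r].data[runs[r].pos];
}

static void
sift_down(size_t i)
{
	for (;;)
	{
		size_t		l = 2 * i + 1;
		size_t		r = 2 * i + 2;
		size_t		smallest = i;
		size_t		tmp;

		if (l < heap_size && head(heap[l]) < head(heap[smallest]))
			smallest = l;
		if (r < heap_size && head(heap[r]) < head(heap[smallest]))
			smallest = r;
		if (smallest == i)
			return;
		tmp = heap[i];
		heap[i] = heap[smallest];
		heap[smallest] = tmp;
		i = smallest;
	}
}

int
main(void)
{
	static const int r0[] = {1, 4, 9};
	static const int r1[] = {2, 3, 11};
	static const int r2[] = {5, 6, 7};
	Run			all[] = {{r0, 3, 0}, {r1, 3, 0}, {r2, 3, 0}};
	size_t		idx[] = {0, 1, 2};
	size_t		i;

	runs = all;
	heap = idx;
	heap_size = 3;

	/* heapify: order the runs by their first element */
	for (i = heap_size; i-- > 0;)
		sift_down(i);

	/*
	 * Repeatedly emit the smallest head, then re-adjust that run's
	 * priority within the heap (the adjustment step described above).
	 */
	while (heap_size > 0)
	{
		size_t		r = heap[0];

		printf("%d\n", head(r));
		if (++runs[r].pos == runs[r].len)
			heap[0] = heap[--heap_size];	/* run exhausted: drop it */
		sift_down(0);
	}
	return 0;
}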

________________________________

From: pgsql-hackers-owner(at)postgresql(dot)org
[mailto:pgsql-hackers-owner(at)postgresql(dot)org] On Behalf Of Jonah H. Harris
Sent: Tuesday, March 07, 2006 5:28 PM
To: Tom Lane
Cc: Simon Riggs; pgsql-hackers(at)postgresql(dot)org
Subject: Re: [HACKERS] Merge algorithms for large numbers of "tapes"

On 3/7/06, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

BTW, I was just looking over Knuth's discussion of sorting again, and
realized that there is still something more that could be done within
the existing sort framework.  We currently use standard polyphase merge
(his Algorithm 5.4.2D), which IIRC I chose because it was simple and
for relatively small numbers of tapes T it was about as good as
anything else.  Knuth spends a great deal of energy on minimizing tape
rewind time, which of course is of no interest to us, and I had
supposed that all of his more-complex algorithms were really only of
interest if you needed to consider rewind time.  However, now that
we've changed the code to prefer large numbers of tapes, it's not at
all clear that Algorithm D is still the right one to use.  In
particular I'm looking at cascade merge, Algorithm 5.4.3C, which
appears to use significantly fewer passes when T is large.  Do you want
to try that?
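
(For a rough sense of why the number of tapes matters here: a plain
balanced merge of R initial runs on T tapes, i.e. (T-1)-way merging,
needs about ceil(log base (T-1) of R) passes over the data.  The toy
program below just prints that back-of-envelope figure for a few
hypothetical values of T and an assumed R; it says nothing about the
exact pass counts of Algorithm 5.4.2D or 5.4.3C.)

/*
 * Back-of-envelope only: pass count of a plain balanced (T-1)-way
 * merge, ceil(log_(T-1) R), for an assumed R = 1000 initial runs.
 * The tape counts below are arbitrary illustrative values.
 */
#include <math.h>
#include <stdio.h>

int
main(void)
{
	const double runs = 1000.0;
	const int	tapes[] = {7, 28, 100, 250};
	size_t		i;

	for (i = 0; i < sizeof(tapes) / sizeof(tapes[0]); i++)
	{
		int			passes = (int) ceil(log(runs) / log((double) (tapes[i] - 1)));

		printf("T = %3d tapes -> ~%d merge passes over the data\n",
			   tapes[i], passes);
	}
	return 0;
}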

I haven't personally played with this algorithm, but having spent the
last 15 minutes reading it over, it does sound worth trying.  At first
glance it didn't seem much better than polyphase for our case, but
after reading the entire algorithm and discussion, and thinking it over
for a couple of minutes, I could see it as potentially better.

Guess we won't really know 'til it can be tested :)

--
Jonah H. Harris, Database Internals Architect
EnterpriseDB Corporation
732.331.1324
