Re: Merge algorithms for large numbers of "tapes"

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-08 18:49:16
Message-ID: C034672C.1EC73%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Jim,

On 3/8/06 9:49 AM, "Jim C. Nasby" <jnasby(at)pervasive(dot)com> wrote:

> On Wed, Mar 08, 2006 at 11:20:50AM -0500, Tom Lane wrote:

>> Not sure that follows. In particular, the entire point of the recent
>> changes has been to extend the range in which we can use a single merge
>> pass --- that is, write the data once as N sorted runs, then merge them
>> in a single read pass. As soon as you have to do an actual merge-back-
>> to-disk pass, your total I/O volume doubles, so there is definitely a
>> considerable gain if that can be avoided. And a larger work_mem
>> translates directly to fewer/longer sorted runs.
>
> But do fewer/longer sorted runs translate into not merging back to disk?
> I thought that was controlled by if we had to be able to rewind the
> result set.

In the *tape* algorithm, there is an intermediate abstraction in the merging
called tapes (!) that are used to store intermediate merge results. Simon's
work implemented more tapes, which asymptotically approaches a single merge
pass as the number of tapes approaches the number of runs.

The Replacement Selection algorithm generally will produce about 1/2 the
number of runs that a simpler partial sort algorithm would, and the more
memory it uses, the fewer runs there are, and with fewer runs, fewer tapes
are required to avoid more passes on the merge.

This whole tape abstraction is something that I believe is unique to
Postgres among modern databases, and we have found that by removing it
entirely along with logtape.c, we remove 2000 lines of useless code that
only complicates our optimization problem.

- Luke

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-03-08 20:32:09 Re: Merge algorithms for large numbers of "tapes"
Previous Message Joachim Wieland 2006-03-08 18:03:56 Re: Status of TODO: postgresql.conf: reset to default when