Re: Merge algorithms for large numbers of "tapes"

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Greg Stark" <gsstark(at)mit(dot)edu>, "Dann Corbit" <DCorbit(at)connx(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-09 22:35:02
Message-ID: C035ED96.1EE09%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Tom,

On 3/9/06 9:44 AM, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:

> I think this argumentation hinges on some irrational aversion to the
> word "tape". Given adequate work_mem, the CVS-tip behavior is exactly
> what you propose already (at least for the cases where we don't need
> random access to the sort result).

Nope. There's the matter of this thing called logtape.c, in addition to the
use of the "tape" as a means of grouping runs. In the current
implementation, runs are not tapes, and tapes as used in the implementation
are an abstraction that only obscures the underlying processes in a
meaningful way.

My objection to tapes is a rational one, and we have internally demonstrated
that by eliminating logtape.c and large hunks of tape algorithm related
code, we get slightly faster performance with 2,000 fewer lines of code,
ergo, the code is not useful. We did this in two days of work, and in the
process uncovered the fact that access was always set to RANDOM, the import
of which we've seen discussed here.

> AFAICS the only result of removing
> the support for multipass merge is that the code would fail, rather than
> run slowly, if it didn't have adequate work_mem for a particular
> problem. Somehow I don't see that as an improvement.

I would only suggest that we replace the existing algorithm with one that
will work regardless of (reasonable) memory requirements. Perhaps we can
agree that at least 1MB of RAM for external sorting will always be available
and proceed from there?

- Luke

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-03-09 23:00:06 Re: Merge algorithms for large numbers of "tapes"
Previous Message Hannu Krosing 2006-03-09 22:31:48 Re: Proposal for SYNONYMS