Re: Merge algorithms for large numbers of "tapes"

From: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
To: "Dann Corbit" <DCorbit(at)connx(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
Cc: "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgreSQL(dot)org
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-08 21:51:30
Message-ID: C03491E2.1ECAD%llonergan@greenplum.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Dann,

On 3/8/06 12:39 PM, "Dann Corbit" <DCorbit(at)connx(dot)com> wrote:

> Here are some suggestions of things that I know work really, really
> well:

Can you point to an example? That might help move the discussion along.

The reason to interject about the tape goo in this discussion is that we
seem to be spending a lot of time optimizing around the tape goo without
tackling the overall structure of the external sort. I think we'll just end
up having to replace all of this goo when we really get around to fixing the
problem.

Add to this that other commercial databases external sort in 1/4 the time or
better on the same hardware with the same CPU/memory resources using a
2-pass external sort.

> #1. Two pass merge (none of that silly poly-tape merge goo)

Voice of reason here. It's what the other database systems do.

> #2. Load ONLY the keys that are to be sorted into memory. Use a
> pointer exchange sort, and do not move the physical rows of data at all.

Sounds right. Example of this in practice?

> I am pretty sure from this thread that PostgreSQL is not doing #1, and I
> have no idea if it is doing #2.

Yep. Even Knuth says that the tape goo is only interesting from a
historical perspective and may not be relevant in an era of disk drives.

- Luke

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Neil Conway 2006-03-08 21:57:52 Re: [PATCHES] Add switches for DELIMITER and NULL in pg_dump COPY
Previous Message Bruce Momjian 2006-03-08 21:45:14 Re: Problemas with gram.y