Re: Merge algorithms for large numbers of "tapes"

From: Greg Stark <gsstark(at)mit(dot)edu>
To: "Luke Lonergan" <llonergan(at)greenplum(dot)com>
Cc: "Dann Corbit" <DCorbit(at)connx(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>, "Simon Riggs" <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Merge algorithms for large numbers of "tapes"
Date: 2006-03-08 23:55:59
Message-ID: 87fylsmqy8.fsf@stark.xeocode.com
Lists: pgsql-hackers


"Luke Lonergan" <llonergan(at)greenplum(dot)com> writes:

> > I am pretty sure from this thread that PostgreSQL is not doing #1, and I
> > have no idea if it is doing #2.
>
> Yep. Even Knuth says that the tape goo is only interesting from a
> historical perspective and may not be relevant in an era of disk drives.

As the size of the data grows larger, the behaviour of hard drives looks more
and more like that of tapes. The biggest factor controlling the speed of i/o
operations is how many seeks are required to complete them. Effectively,
"rewinds" are still the problem; it's just that the cost of a rewind becomes
constant regardless of how long the "tape" is.
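To put rough numbers on the claim that seeks dominate (the seek time and
transfer rate below are illustrative assumptions, not figures from this
thread), here is a back-of-envelope sketch of effective disk throughput
when each read of a chunk is preceded by one seek:

```python
# Back-of-envelope: how seeks dominate disk I/O throughput.
# SEEK_TIME_S and TRANSFER_MB_PER_S are assumed, illustrative values.
SEEK_TIME_S = 0.008        # average seek latency, assumed
TRANSFER_MB_PER_S = 100.0  # sequential transfer rate, assumed

def effective_throughput(chunk_mb):
    """MB/s achieved when one seek precedes each chunk_mb sequential read."""
    transfer_time = chunk_mb / TRANSFER_MB_PER_S
    return chunk_mb / (SEEK_TIME_S + transfer_time)

# Small chunks are seek-bound; large chunks approach sequential speed.
for chunk in (0.1, 1.0, 10.0, 100.0):
    print(f"{chunk:6.1f} MB per seek -> {effective_throughput(chunk):6.1f} MB/s")
```

With these assumed numbers, reading 0.1 MB per seek achieves only about
11 MB/s, while 100 MB per seek gets within a percent of the sequential rate,
which is why interleaving reads across many runs on one spindle hurts so much.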

That's one thing that gives me pause about the current approach of using more
tapes. Ideally the user would create a temporary work space on each spindle
and the database would arrange to use no more tapes than there are spindles.
Then each merge operation would involve only sequential access for both
reads and writes.
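The merge step itself is simple; the cost being discussed comes entirely from
where the runs live. A minimal sketch of a k-way merge over sorted runs
(in-memory lists stand in for on-disk "tapes" here; this is an illustration,
not PostgreSQL's implementation):

```python
import heapq

def kway_merge(runs):
    """Merge k sorted runs into one sorted list.

    Each run stands in for a sorted on-disk tape. Note that consecutive
    pops usually come from different runs, so on a single spindle each
    step can cost a seek; one spindle per run keeps every run's reads
    sequential.
    """
    # Seed the heap with the first element of each non-empty run.
    heap = [(run[0], i, 0) for i, run in enumerate(runs) if run]
    heapq.heapify(heap)
    out = []
    while heap:
        value, run_idx, pos = heapq.heappop(heap)
        out.append(value)
        nxt = pos + 1
        if nxt < len(runs[run_idx]):
            heapq.heappush(heap, (runs[run_idx][nxt], run_idx, nxt))
    return out

runs = [[1, 4, 9], [2, 3, 10], [5, 6, 7, 8]]
print(kway_merge(runs))  # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
```

Tracing the pops shows the access pattern bouncing between runs on nearly
every step, which is exactly the random-access pattern a one-spindle merge
pays for in seeks.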

--
greg
