From: | Greg Stark <gsstark(at)mit(dot)edu> |
---|---|
To: | "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com> |
Cc: | Greg Stark <gsstark(at)mit(dot)edu>, Luke Lonergan <llonergan(at)greenplum(dot)com>, Dann Corbit <DCorbit(at)connx(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Simon Riggs <simon(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org |
Subject: | Re: Merge algorithms for large numbers of "tapes" |
Date: | 2006-03-09 03:20:08 |
Message-ID: | 87acc0mhhz.fsf@stark.xeocode.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
"Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> On Wed, Mar 08, 2006 at 06:55:59PM -0500, Greg Stark wrote:
> >
> > "Luke Lonergan" <llonergan(at)greenplum(dot)com> writes:
> >
> > > > I am pretty sure from this thread that PostgreSQL is not doing #1, and I
> > > > have no idea if it is doing #2.
> > >
> > > Yep. Even Knuth says that the tape goo is only interesting from a
> > > historical perspective and may not be relevant in an era of disk drives.
> >
> > As the size of the data grows larger the behaviour of hard drives looks more
> > and more like tapes. The biggest factor controlling the speed of i/o
> > operations is how many seeks are required to complete them. Effectively
> > "rewinds" are still the problem it's just that the cost of rewinds becomes
> > constant regardless of how long the "tape" is.
>
> But it will take a whole lot of those rewinds to equal the amount of
> time required by an additional pass through the data. I'll venture a
> guess that as long as you've got enough memory to still read chunks back
> in 8k blocks that it won't be possible for a multi-pass sort to
> out-perform a one-pass sort.
Well that's clearly a bit overoptimistic. If we believe the random page cost
of 4 then having more tapes than you have spindles would impose a penalty
equal to having four times as many passes.
(And that's *with* the 8k block size. And with the kernel performing pre-fetch
already too.)
> For that to be of any use, wouldn't you need to use only as many tapes
> as spindles/2? Otherwise you're still trying to read and write from the
> same set of drives, which means you're probably doing a lot of seeking.
> Or do the tape algorithms re-write data as they read it?
Well, spindles-1. I was thinking as many tapes as you have spindles *in total*,
ie, including the output tape. You only have one output tape for each n-way
merge though.
--
greg
From | Date | Subject | |
---|---|---|---|
Next Message | Marc G. Fournier | 2006-03-09 03:27:33 | Re: [COMMITTERS] pgsql: Remove Christof Petig copyright |
Previous Message | Jim C. Nasby | 2006-03-09 03:18:51 | Re: [COMMITTERS] pgsql: Remove Christof Petig copyright on include file, per author |