Re: Compression and on-disk sorting

From: "Jim C(dot) Nasby" <jnasby(at)pervasive(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Greg Stark <gsstark(at)mit(dot)edu>, Martijn van Oosterhout <kleptog(at)svana(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Rod Taylor <pg(at)rbt(dot)ca>, "Bort, Paul" <pbort(at)tmwsystems(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Compression and on-disk sorting
Date: 2006-05-17 15:45:04
Message-ID: 20060517154504.GR26212@pervasive.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers pgsql-patches

On Wed, May 17, 2006 at 11:38:05AM -0400, Tom Lane wrote:
> "Jim C. Nasby" <jnasby(at)pervasive(dot)com> writes:
> > What *might* make sense would be to provide two locations for pgsql_tmp,
> > because a lot of operations in there involve reading and writing at the
> > same time:
>
> > Read from heap while writing tapes to pgsql_tmp
> > read from tapes while writing final version to pgsql_tmp
>
> Note that a large part of the reason for the current logtape.c design
> is to avoid requiring 2X or more disk space to sort X amount of data.
> AFAICS, any design that does the above will put us right back in the 2X
> regime. That's a direct, measurable penalty; it'll take more than
> handwaving arguments to convince me we should change it in pursuit of
> unquantified speed benefits.

I certainly agree that there's no reason to make this change without
testing, but if you've got enough spindles laying around to actually
make use of this I suspect that requiring 2X the disk space to sort X
won't bother you.

Actually, I suspect in most cases it won't matter; I don't think people
make a habit of trying to sort their entire database. :) But we'd want
to protect for the oddball cases... yech.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby(at)pervasive(dot)com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephan Szabo 2006-05-17 15:54:08 Re: Foreign key column reference ordering and information_schema
Previous Message Zeugswetter Andreas DCP SD 2006-05-17 15:43:39 Re: pg_dump and backslash escapes

Browse pgsql-patches by date

  From Date Subject
Next Message Tom Lane 2006-05-17 15:57:24 Re: Compression and on-disk sorting
Previous Message Tom Lane 2006-05-17 15:38:05 Re: Compression and on-disk sorting