Re: Significantly larger toast tables on 8.4?

From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>, Alex Hunsaker <badalex(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Significantly larger toast tables on 8.4?
Date: 2009-01-02 22:48:15
Message-ID: 20090102224815.GA29489@svana.org
Lists: pgsql-hackers

On Fri, Jan 02, 2009 at 03:35:18PM -0500, Robert Haas wrote:
> Any compression algorithm is going to require you to decompress the
> entire string before extracting a substring at a given offset. When
> the data is uncompressed, you can jump directly to the offset you want
> to read. Even if the compression algorithm requires no overhead at
> all, it's going to make the location of the data nondeterministic, and
> therefore force additional disk reads.

So you compromise. You split the data into, say, 1MB blobs and compress
each individually. Then if someone does a substring at offset 3MB you
can find the right blob quickly, and it costs you very little in
compression ratio.

Implementation, though, is harder. The blob size is also tunable; I
imagine the optimal value is probably around 100KB (12 uncompressed
blocks).
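The idea above can be sketched roughly as follows. This is a minimal
illustration in Python using zlib, not PostgreSQL's TOAST code; the
blob size, function names, and storage as an in-memory list are all
assumptions made for the example. The point is that a substring read
only decompresses the blobs it actually spans:

```python
import zlib

# Hypothetical tunable blob size; 1MB is the figure from the text above.
BLOB_SIZE = 1 << 20


def compress_blobs(data: bytes, blob_size: int = BLOB_SIZE) -> list[bytes]:
    """Split data into fixed-size chunks and compress each one independently."""
    return [zlib.compress(data[i:i + blob_size])
            for i in range(0, len(data), blob_size)]


def substring(blobs: list[bytes], offset: int, length: int,
              blob_size: int = BLOB_SIZE) -> bytes:
    """Extract a substring, decompressing only the blobs it overlaps."""
    first = offset // blob_size
    last = (offset + length - 1) // blob_size
    # Decompress just the spanned blobs and slice out the requested range.
    chunk = b"".join(zlib.decompress(blobs[i]) for i in range(first, last + 1))
    start = offset - first * blob_size
    return chunk[start:start + length]
```

Because each blob is compressed on its own, a read at a 3MB offset
touches only blob 3 (with 1MB blobs) instead of decompressing
everything before it; the cost is that the compressor cannot exploit
redundancy across blob boundaries, which is why very small blobs would
start to hurt the ratio.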

Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.
