From: Martijn van Oosterhout <kleptog(at)svana(dot)org>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>, Alex Hunsaker <badalex(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Significantly larger toast tables on 8.4?
Date: 2009-01-02 22:48:15
Message-ID: 20090102224815.GA29489@svana.org
Lists: pgsql-hackers
On Fri, Jan 02, 2009 at 03:35:18PM -0500, Robert Haas wrote:
> Any compression algorithm is going to require you to decompress the
> entire string before extracting a substring at a given offset. When
> the data is uncompressed, you can jump directly to the offset you want
> to read. Even if the compression algorithm requires no overhead at
> all, it's going to make the location of the data nondeterministic, and
> therefore force additional disk reads.
So you compromise. You split the data into, say, 1MB blobs and compress
each one individually. Then if someone does a substring at offset 3MB you
can jump straight to the right blob. This costs you very little in
compression ratio.
The implementation, though, is harder. The blob size is also tunable;
I imagine the optimal value will be somewhere around 100KB (12
uncompressed 8KB blocks).
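To make the idea concrete, here is a minimal sketch in Python using zlib
(not PostgreSQL's actual pglz code); the chunk size, function names, and
use of a list of blobs are all illustrative assumptions:

```python
import zlib

CHUNK_SIZE = 100 * 1024  # illustrative blob size, roughly the 100KB suggested above


def compress_chunked(data: bytes) -> list[bytes]:
    """Compress data as a list of independently compressed fixed-size blobs."""
    return [zlib.compress(data[i:i + CHUNK_SIZE])
            for i in range(0, len(data), CHUNK_SIZE)]


def substring(chunks: list[bytes], offset: int, length: int) -> bytes:
    """Extract a substring by decompressing only the blobs that overlap it."""
    first = offset // CHUNK_SIZE
    last = (offset + length - 1) // CHUNK_SIZE
    # Decompress just the overlapping blobs, then slice out the wanted range.
    buf = b"".join(zlib.decompress(chunks[i]) for i in range(first, last + 1))
    start = offset - first * CHUNK_SIZE
    return buf[start:start + length]
```

A substring at a 3MB offset then touches only one or two blobs instead of
forcing decompression of the whole datum, while each blob still compresses
nearly as well as one large stream would.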
Have a nice day,
--
Martijn van Oosterhout <kleptog(at)svana(dot)org> http://svana.org/kleptog/
> Please line up in a tree and maintain the heap invariant while
> boarding. Thank you for flying nlogn airlines.