Robert Haas wrote:
>On Fri, Jan 2, 2009 at 3:23 PM, Stephen R. van den Berg <srb@cuci.nl> wrote:
>> Three things:
>> a. Shouldn't it in theory be possible to have a decompression algorithm
>> which is IO-bound because it decompresses faster than the disk can
>> supply the data? (On common current hardware).
>> b. Has the current algorithm been carefully benchmarked and/or optimised
>> and/or chosen to fit the IO-bound target as close as possible?
>> c. Are there any well-known pitfalls/objections which would prevent me from
>> changing the algorithm to something more efficient (read: IO-bound)?
>Any compression algorithm is going to require you to decompress the
>entire string before extracting a substring at a given offset. When
>the data is uncompressed, you can jump directly to the offset you want
>to read. Even if the compression algorithm requires no overhead at
>all, it's going to make the location of the data nondeterministic, and
>therefore force additional disk reads.
That shouldn't be insurmountable:
- I currently have difficulty imagining applications that actually do
lots of substring extractions from large compressible fields.
The most likely candidate would be a table containing tsearch-indexed
large text fields, but those are unlikely to participate in
many substring extractions.
- Even if substring operations were likely, I could envision a compression
format that compresses the data in independent chunks of, say, 64KB; each
chunk could then be decompressed on its own, allowing random access without
decompressing the whole value.
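The chunked format sketched above could look something like this. This is
only an illustrative sketch, not anything in PostgreSQL: the chunk size,
helper names, and use of zlib are all assumptions for demonstration. A
substring touching bytes [start, start+length) only needs the chunks that
range overlaps, so the decompression cost is bounded by the chunk size, not
the total value size.

```python
import zlib

CHUNK = 64 * 1024  # hypothetical 64KB uncompressed chunk size, per the proposal

def compress_chunked(data: bytes) -> list[bytes]:
    """Compress data as a list of independently decompressible chunks."""
    return [zlib.compress(data[i:i + CHUNK]) for i in range(0, len(data), CHUNK)]

def substring(chunks: list[bytes], start: int, length: int) -> bytes:
    """Extract data[start:start+length], decompressing only the chunks touched."""
    first = start // CHUNK
    last = (start + length - 1) // CHUNK
    # Decompress just the overlapping chunks and slice out the requested range.
    buf = b"".join(zlib.decompress(chunks[i]) for i in range(first, last + 1))
    offset = start - first * CHUNK
    return buf[offset:offset + length]

data = b"0123456789" * 20000          # ~200KB of compressible data -> 4 chunks
chunks = compress_chunked(data)
assert substring(chunks, 65530, 20) == data[65530:65550]  # spans a chunk boundary
```

The trade-off is a somewhat worse compression ratio, since each chunk
restarts with an empty dictionary, in exchange for O(chunk-size) rather than
O(value-size) substring cost.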
Stephen R. van den Berg.
"Always remember that you are unique. Just like everyone else."