Re: pg_lzcompress strategy parameters

From: Jan Wieck <JanWieck(at)Yahoo(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Gregory Stark <stark(at)enterprisedb(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_lzcompress strategy parameters
Date: 2007-08-08 16:54:49
Message-ID: 46B9F559.3070403@Yahoo.com
Lists: pgsql-hackers

On 8/5/2007 6:30 PM, Tom Lane wrote:
> Gregory Stark <stark(at)enterprisedb(dot)com> writes:
>> (Incidentally, this means what I said earlier about uselessly trying to
>> compress objects below 256 is even grosser than I realized. If you have a
>> single large object which even after compressing will be over the toast target
>> it will force *every* varlena to be considered for compression even though
>> they mostly can't be compressed. Considering a varlena smaller than 256 for
>> compression only costs a useless palloc, so it's not the end of the world but
>> still. It does seem kind of strange that a tuple which otherwise wouldn't be
>> toasted at all suddenly gets all its fields compressed if you add one more
>> field which ends up being stored externally.)
>
> Yeah. It seems like we should modify the first and third loops so that
> if (after compression if any) the largest attribute is *by itself*
> larger than the target threshold, then we push it out to the toast table
> immediately, rather than continuing to compress other fields that might
> well not need to be touched.

I agree with the general lack of sanity in the logic and think this one
is a good starter.

Another optimization to consider eventually would be to let the
compressor abort the attempt once the first X bytes have had to be copied
literally. People can disable compression on a per-column basis, but how
many actually do so? And if the first 100,000 bytes of a 10M attribute
can't be compressed, it is very likely that the input is already
compressed.
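
A minimal sketch of that early-abort idea, for illustration only. It uses
Python's zlib as a stand-in for pg_lzcompress, and it measures how much the
first chunk shrank rather than counting literally copied bytes, so it is an
approximation of the proposal and not the pglz code path. The function name
and the probe_bytes and min_saving parameters are hypothetical.

```python
import os
import zlib


def compress_with_early_abort(data, probe_bytes=100_000, min_saving=0.05):
    """Return zlib-compressed data, or None if the attempt is abandoned.

    probe_bytes and min_saving are hypothetical tuning knobs: if the first
    probe_bytes of input shrink by less than min_saving, give up early.
    """
    if not data:
        return None

    comp = zlib.compressobj()
    probe = data[:probe_bytes]

    # Compress only the prefix, then sync-flush so that the output size
    # produced so far can be measured without ending the stream.
    out = comp.compress(probe) + comp.flush(zlib.Z_SYNC_FLUSH)
    if len(out) > len(probe) * (1.0 - min_saving):
        # The prefix barely shrank: the input is probably compressed
        # already, so the rest of it is not processed at all.
        return None

    # The prefix compressed well enough; continue with the same stream.
    out += comp.compress(data[probe_bytes:]) + comp.flush()

    # Keep the result only if it is actually smaller than the input.
    return out if len(out) < len(data) else None


if __name__ == "__main__":
    already_compressed = os.urandom(1_000_000)   # behaves like compressed data
    repetitive = b"pgsql-hackers " * 100_000
    print(compress_with_early_abort(already_compressed) is None)
    print(compress_with_early_abort(repetitive) is not None)
```

The point of checking after a fixed prefix is that the wasted work on an
incompressible 10M attribute is bounded by the probe size instead of growing
with the attribute.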

Jan

--
#======================================================================#
# It's easier to get forgiveness for being wrong than for being right. #
# Let's break this rule - forgive me. #
#================================================== JanWieck(at)Yahoo(dot)com #
