pg_lzcompress strategy parameters

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gregory Stark <stark(at)enterprisedb(dot)com>, Jan Wieck <JanWieck(at)Yahoo(dot)com>
Cc: pgsql-hackers(at)postgreSQL(dot)org
Subject: pg_lzcompress strategy parameters
Date: 2007-08-04 22:19:30
Message-ID: 8566.1186265970@sss.pgh.pa.us
Lists: pgsql-hackers
Greg complained here
http://archives.postgresql.org/pgsql-patches/2007-07/msg00342.php
that the default strategy parameters used by the TOAST compressor
might need some adjustment.  After thinking about it a little I wonder
whether they're not even more broken than that.  The present behavior
is:

1. Never compress for inputs < min_input_size (256 bytes by default).
2. Compress inputs >= force_input_size (6K by default), as long as
   compression produces a result at least 1 byte smaller than the input.
3. For inputs between min_input_size and force_input_size, compress only
   if compression of at least min_comp_rate percent is achieved
   (20% by default).

This whole structure seems a bit broken, independently of whether the
particular parameter values are good.  If the compressor is given an
input of 1000000 bytes and manages to compress it to 999999 bytes,
we'll store it compressed, and pay for decompression cycles on every
access, even though the I/O savings are nonexistent.  That's not sane.
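To make the present behavior concrete, here is a minimal sketch of the three rules as listed above. It models only the accept/reject decision described in this message, not the actual code in pg_lzcompress.c. The function name stores_compressed is made up for illustration, and "6K" is assumed to mean 6 * 1024 bytes.

```python
def stores_compressed(input_size, compressed_size,
                      min_input_size=256,
                      force_input_size=6 * 1024,  # assumption: "6K" = 6144 bytes
                      min_comp_rate=20):
    """Return True if the three rules above would keep the compressed copy.

    Hypothetical helper: it only models the decision as described in the
    message, given an input size and the size the compressor produced.
    """
    if input_size < min_input_size:
        return False              # rule 1: never compress small inputs
    saved = input_size - compressed_size
    if input_size >= force_input_size:
        return saved >= 1         # rule 2: any saving at all is accepted
    # rule 3: require at least min_comp_rate percent savings
    return saved * 100 >= input_size * min_comp_rate


print(stores_compressed(1_000_000, 999_999))  # large input, 1 byte saved
print(stores_compressed(1000, 900))           # mid-size input, 10% saved
print(stores_compressed(1000, 800))           # mid-size input, 20% saved
```

Under these assumptions the first call is accepted even though only one byte in a million is saved, while the second is rejected despite saving 10%.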

I'm inclined to think that the concept of force_input_size is wrong.
Instead I suggest that we have a min_comp_rate (minimum percentage
savings) and a min_savings (minimum absolute savings), and compress
if either one is met.  For instance, with min_comp_rate = 10% and
min_savings = 1MB, inputs below 10MB (where 10% is less than 1MB)
would need at least 10% savings to be compressed, while inputs above
10MB would need at least 1MB saved.
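A rough sketch of the proposed either-threshold rule, using the example values from the paragraph above. The function name worth_compressing is hypothetical, and MB is assumed to mean 1024 * 1024 bytes; neither comes from the existing code.

```python
MB = 1024 * 1024  # assumption: binary megabytes


def worth_compressing(input_size, compressed_size,
                      min_comp_rate=10, min_savings=1 * MB):
    """Keep the compressed copy if either threshold is met (hypothetical helper)."""
    saved = input_size - compressed_size
    meets_rate = saved * 100 >= input_size * min_comp_rate
    meets_absolute = saved >= min_savings
    return meets_rate or meets_absolute


# Below the 10MB crossover, the percentage threshold is the easier one to meet.
print(worth_compressing(5 * MB, 5 * MB - MB // 2))   # 10% saved
print(worth_compressing(5 * MB, 5 * MB - MB // 4))   # 5% saved, only 0.25MB

# Above the crossover, the absolute threshold is the easier one to meet.
print(worth_compressing(50 * MB, 49 * MB))           # 2% saved, but a full 1MB
print(worth_compressing(1_000_000, 999_999))         # 1 byte saved
```

The crossover sits at min_savings / (min_comp_rate / 100), which is 10MB for these values: that is the input size at which 10% savings equals 1MB.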

Or maybe it should just be a min_comp_rate and nothing else.
Compressing a 1GB field to 999MB is probably not very sane either.
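For comparison, a small sketch of how the 1GB case fares under a rate-only rule versus a rate-or-absolute rule. Both function names are made up, the 10% and 1MB values are just the example figures from this message, and decimal units (1GB = 1000MB) are assumed so that 1GB to 999MB is exactly 1MB saved.

```python
MB = 1_000_000   # assumption: decimal units, so 1GB = 1000MB
GB = 1000 * MB


def meets_rate(input_size, compressed_size, min_comp_rate=10):
    """Rate-only rule: require min_comp_rate percent savings (hypothetical)."""
    saved = input_size - compressed_size
    return saved * 100 >= input_size * min_comp_rate


def meets_rate_or_savings(input_size, compressed_size,
                          min_comp_rate=10, min_savings=1 * MB):
    """Either-threshold rule: percentage or absolute savings (hypothetical)."""
    saved = input_size - compressed_size
    return (meets_rate(input_size, compressed_size, min_comp_rate)
            or saved >= min_savings)


print(meets_rate_or_savings(1 * GB, 999 * MB))  # 1MB saved meets min_savings
print(meets_rate(1 * GB, 999 * MB))             # 0.1% saved fails the 10% bar
```

With these assumed values the either-threshold rule still stores the field compressed for a 0.1% gain, which is the case that argues for min_comp_rate alone.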

This is all independent of what the specific parameter settings should
be, but I concur with Greg that those could do with a fresh look.

Thoughts?

			regards, tom lane
