Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>
Cc: "Douglas McNaught" <doug(at)mcnaught(dot)org>, "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, lar(at)quicklz(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)
Date: 2009-01-06 04:58:25
Message-ID: 3190.1231217905@sss.pgh.pa.us
Lists: pgsql-hackers

"Robert Haas" <robertmhaas(at)gmail(dot)com> writes:
> After reading these discussions, I guess I still don't understand why
> we would treat small and large datums differently. It seems to me
> that you had it about right here:
> http://archives.postgresql.org/pgsql-hackers/2007-08/msg00082.php
> # Or maybe it should just be a min_comp_rate and nothing else.
> # Compressing a 1GB field to 999MB is probably not very sane either.

Well, that's okay with me. I think that the other discussion was mainly
focused on the silliness of compressing large datums when only a small
percentage could be saved.
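
As a rough illustration of the "min_comp_rate and nothing else" idea quoted above, the check might be sketched as follows. The function name saves_enough and its signature are hypothetical and are not taken from the PostgreSQL source; the only assumption is that min_comp_rate is a percentage of the raw size that must be saved for the compressed result to be kept.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper, not PostgreSQL code: returns true when compressing
 * raw_len bytes down to comp_len bytes saves at least min_comp_rate percent
 * of the raw size.
 */
static bool
saves_enough(int64_t raw_len, int64_t comp_len, int min_comp_rate)
{
    assert(raw_len > 0 && min_comp_rate >= 0 && min_comp_rate <= 100);

    /* saved / raw >= rate / 100, cross-multiplied to avoid division */
    return (raw_len - comp_len) * 100 >= raw_len * (int64_t) min_comp_rate;
}
```

With an illustrative rate of 25 percent, a 1GB datum that compresses to 999MB saves only about 2.4 percent and would be rejected, whatever its absolute size.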

What we might do for the moment is just to set the upper limit to
INT_MAX in the default strategy, rather than rip out the logic
altogether. IIRC that limit is checked only once per compression,
not in the inner loop, so it won't cost us any noticeable performance
to leave the logic there in case someone finds a use for it.
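
A minimal sketch of what that could look like, under stated assumptions: the struct ToyStrategy, its field names, and the values 32 and 25 are placeholders for illustration, not the actual strategy definition. The point is only that the size bounds are tested once, before any compression work starts, so setting the upper bound to INT_MAX effectively disables it while leaving the check in place.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative strategy record; the field names are assumptions. */
typedef struct ToyStrategy
{
    int min_input_size;   /* do not bother below this many bytes */
    int max_input_size;   /* do not bother above this many bytes */
    int min_comp_rate;    /* required savings, in percent */
} ToyStrategy;

/* Upper limit set to INT_MAX: the bound stays in the code but never fires. */
static const ToyStrategy toy_default = { 32, INT_MAX, 25 };

/*
 * Called once per compression attempt, before the inner loop runs,
 * so its cost is a couple of integer comparisons per datum.
 */
static bool
worth_attempting(const ToyStrategy *s, int input_len)
{
    assert(s != NULL);
    return input_len >= s->min_input_size &&
           input_len <= s->max_input_size;
}
```

Someone who later wants a real upper limit would only need to supply a strategy with a smaller max_input_size; nothing in the compression loop itself would change.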

> not compressing very small datums (< 256 bytes) also seems smart,
> since that could end up producing a lot of extra compression attempts,
> most of which will end up saving little or no space.

But note that the current code will usually not try to do that anyway,
at least for rows of ordinary numbers of columns.

The present code has actually reduced the lower-bound threshold from
where it used to be. I think that if anyone wants to argue for a
different value, it'd be time to whip out some actual tests ;-).
We can't set specific parameter values from gedanken-experiments.

regards, tom lane
