Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)

From: Gregory Stark <stark(at)enterprisedb(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Douglas McNaught" <doug(at)mcnaught(dot)org>, "Stephen R(dot) van den Berg" <srb(at)cuci(dot)nl>, "Alvaro Herrera" <alvherre(at)commandprompt(dot)com>, lar(at)quicklz(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: QuickLZ compression algorithm (Re: Inclusion in the PostgreSQL backend for toasting rows)
Date: 2009-01-06 07:47:24
Message-ID: 87wsd8j2pf.fsf@oxford.xeocode.com
Lists: pgsql-hackers


> "Robert Haas" <robertmhaas(at)gmail(dot)com> writes:
>
>> not compressing very small datums (< 256 bytes) also seems smart,
>> since that could end up producing a lot of extra compression attempts,
>> most of which will end up saving little or no space.

That was presumably the rationale for the original logic. However, experience
shows that there are certainly databases that store a lot of short,
compressible strings.

Obviously, databases that use CHAR(n) columns desperately need us to compress
them, since those values are padded with spaces. But even plain text data is
often moderately compressible, even with our fairly weak compression algorithm.
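
As a rough illustration of the CHAR(n) point, here is a small Python sketch
that measures how well a short, space-padded value compresses. It uses zlib
purely as a stand-in (it is not the compressor PostgreSQL uses for toasting),
and the helper name compressed_ratio and the sample value are made up for the
example.

```python
import zlib


def compressed_ratio(value: bytes) -> float:
    """Return compressed size divided by original size (lower is better)."""
    return len(zlib.compress(value)) / len(value)


# A CHAR(100)-style value: a short string padded out to 100 bytes with spaces.
# This is well under the 256-byte cutoff mentioned in the quoted text.
padded = "Smith".ljust(100).encode("ascii")

print(f"{len(padded)} bytes, ratio {compressed_ratio(padded):.2f}")
```

The exact ratio depends on the compressor, so treat the number as indicative
only.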

One other thing that bothers me about our toast mechanism is that it only
kicks in for tuples that are "too large". It seems weird that the same column
is worth compressing or not depending on what other columns are in the same
tuple.

If you store a 2000-byte tuple that's all spaces, we don't try to compress it
at all. But if you add one more attribute, we go to great lengths compressing
and storing attributes externally -- not necessarily the attribute you just
added, but also the ones that were perfectly fine previously -- to try to get
the tuple under 2k.
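
The behaviour described above can be mimicked with a toy model. This is only a
sketch under stated assumptions, not the actual toast code: the 2000-byte
threshold is taken from the "2k" figure above, zlib stands in for the real
compressor, the largest-first order is an assumption, and the names toy_toast
and TOAST_THRESHOLD are invented for the example.

```python
import zlib

TOAST_THRESHOLD = 2000  # assumed value, based on the "2k" figure in the text


def toy_toast(attrs, threshold=TOAST_THRESHOLD):
    """Toy model: compress attributes only while the whole tuple is too large.

    attrs is a list of bytes values. Returns (stored, tried), where tried[i]
    says whether compression was attempted on attribute i.
    """
    stored = list(attrs)
    tried = [False] * len(attrs)
    while sum(len(a) for a in stored) > threshold:
        candidates = [i for i, done in enumerate(tried) if not done]
        if not candidates:
            break  # a real system would move data out of line at this point
        # Assumption: work on the largest untried attribute first.
        i = max(candidates, key=lambda j: len(stored[j]))
        packed = zlib.compress(stored[i])
        if len(packed) < len(stored[i]):
            stored[i] = packed
        tried[i] = True
    return stored, tried


spaces = b" " * 2000

# Alone, the all-spaces attribute fits, so it is never compressed.
_, tried_alone = toy_toast([spaces])

# Add a 10-byte attribute and the old attribute is the one that gets compressed.
stored_both, tried_both = toy_toast([spaces, b"x" * 10])

print(tried_alone, tried_both)
```

In this model the decision to compress the 2000-byte attribute depends
entirely on what else is in the tuple, which is the oddity being pointed out.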

--
Gregory Stark
EnterpriseDB http://www.enterprisedb.com
Ask me about EnterpriseDB's RemoteDBA services!
