Re: Select count(*), the sequel

From: "Pierre C" <lists(at)peufeu(dot)com>
To: "Robert Haas" <robertmhaas(at)gmail(dot)com>, "Tom Lane" <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "Kevin Grittner" <Kevin(dot)Grittner(at)wicourts(dot)gov>
Cc: pgsql-performance(at)postgresql(dot)org, "Kenneth Marshall" <ktm(at)rice(dot)edu>, "Mladen Gogala" <mladen(dot)gogala(at)vmsinfo(dot)com>
Subject: Re: Select count(*), the sequel
Date: 2010-10-28 09:33:16
Message-ID: op.vk94tqzceorkce@apollo13
Lists: pgsql-performance

> "Pierre C" <lists(at)peufeu(dot)com> wrote:
>
>> in-page compression
> How would that be different from the in-page compression done by
> TOAST now? Or are you just talking about being able to make it
> more aggressive?
> -Kevin

Well, I suppose lzo-style compression would be best used on data that is
written a few times at most and then mostly read (a forum, a data
warehouse, etc). Good candidate pages for compression probably also
have all tuples visible to all transactions, so all row headers
would be identical and would compress very well. Of course this introduces
a "small" problem for deletes and updates...
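
To make the "identical row headers compress very well" point concrete, here
is a rough sketch. It uses Python's zlib as a stand-in for an lzo-style
compressor (lzo is not in the standard library), and the header size, row
count and header contents are made-up illustrative values, not PostgreSQL's
real on-disk layout.

```python
import random
import zlib

HEADER_SIZE = 23   # assumed fixed-size row header, for illustration only
ROWS = 100         # assumed number of tuples on the page

# Case 1: every tuple on the page carries the same header bytes,
# as would happen if all tuples were visible to all transactions.
shared_header = bytes(range(HEADER_SIZE))
identical_headers = shared_header * ROWS

# Case 2: every tuple carries unrelated header bytes (worst case).
rng = random.Random(0)  # fixed seed so the comparison is repeatable
distinct_headers = bytes(rng.randrange(256) for _ in range(HEADER_SIZE * ROWS))

compressed_identical = len(zlib.compress(identical_headers))
compressed_distinct = len(zlib.compress(distinct_headers))

print("raw bytes:           ", len(identical_headers))
print("identical, compressed:", compressed_identical)
print("distinct, compressed: ", compressed_distinct)
```

The identical headers shrink to a small fraction of their raw size, while
the unrelated ones barely compress at all.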

Delta compression means: take all the values for a column inside a page,
look at the values and their statistical distribution, notice for example
that they're all INTs and that the values on the page fall between X-n and
X+n, then store X once and encode only each row's offset from X with as few
bits as possible. This is only an example; the idea is to exploit the fact
that on the same page, all the values of one column often have a lot in
common. xid values in row headers are a good example of this.
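
A minimal sketch of that per-page delta idea, in Python. This is only an
illustration, not anything PostgreSQL does: the function names are invented,
and it uses the page minimum as the base (instead of a midpoint X) so that
every offset is non-negative and needs no sign bit.

```python
def delta_encode(values):
    """Encode one column's values for one page.

    Returns (base, bits_per_value, packed, count), where `packed` is a
    single integer holding every offset from `base` in `bits_per_value` bits.
    """
    if not values:
        return (0, 0, 0, 0)
    base = min(values)                       # stored once per page
    offsets = [v - base for v in values]     # small non-negative numbers
    bits = max(offsets).bit_length()         # fewest bits that fit them all
    packed = 0
    for i, off in enumerate(offsets):
        packed |= off << (i * bits)
    return (base, bits, packed, len(values))


def delta_decode(encoded):
    """Rebuild the original values from delta_encode()'s output."""
    base, bits, packed, count = encoded
    mask = (1 << bits) - 1
    return [base + ((packed >> (i * bits)) & mask) for i in range(count)]


# Hypothetical xid-like values from the rows of a single page.
xids = [1000123, 1000125, 1000123, 1000130, 1000124]
encoded = delta_encode(xids)
base, bits, packed, count = encoded

print("base:", base, "bits per row:", bits)
print("payload bits:", bits * count, "vs", 32 * count, "for plain 32-bit values")
```

With these sample values the offsets are 0, 2, 0, 7 and 1, so each row needs
3 bits instead of 32, plus the base stored once for the page.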

TOAST compresses datums, so it performs well on large datums; this is the
opposite: the idea is to compress small tuples by exploiting the redundancies
between tuples.
