Re: jsonb format is pessimal for toast compression

From: Marti Raudsepp <marti(at)juffo(dot)org>
To: Hannu Krosing <hannu(at)2ndquadrant(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Larry White <ljw1001(at)gmail(dot)com>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-08-12 12:41:44
Message-ID: CABRT9RDKfOF7+8gonQggcPXSvu8TwXOTGJKvV4=u=SHBq8Dspg@mail.gmail.com
Lists: pgsql-hackers

On Fri, Aug 8, 2014 at 10:50 PM, Hannu Krosing <hannu(at)2ndquadrant(dot)com> wrote:
> How hard and how expensive would it be to teach pg_lzcompress to
> apply a delta filter on suitable data ?
>
> So that instead of integers their deltas will be fed to the "real"
> compressor

Has anyone given this more thought? I know this might not be 9.4
material, but to me it sounds like the most promising approach, if
it's workable. This isn't a made-up idea: the 7z and LZMA formats
also have an optional delta filter.

Of course with JSONB the problem is figuring out which parts to apply
the delta filter to, and which parts not.

This would also help with integer arrays containing, for example,
foreign key values referencing a serial column. There's bound to be some
redundancy, as nearby serial values are likely to end up close
together. In one of my past projects we used to store large arrays of
integer fkeys, deliberately sorted for duplicate elimination.

For an ideal-case comparison, intar2 could be as small as intar1 when
compressed with a 4-byte-wide delta filter:

create table intar1 as
  select array(select 1::int from generate_series(1,1000000)) a;
create table intar2 as
  select array(select generate_series(1,1000000)::int) a;

In PostgreSQL 9.3 the sizes are:
select pg_column_size(a) from intar1;
45810
select pg_column_size(a) from intar2;
4000020

So that's a difference of about a factor of 87.
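
To make the ideal case concrete, here is a rough sketch of a 4-byte-wide
delta filter. It is only an illustration: zlib stands in for pglz, the
array header is ignored, and the helper names (delta_encode, delta_decode,
pack) are made up for this example, so the byte counts will not match
pg_column_size(). The point is that the deltas of 1..1000000 are exactly
the all-ones array, so the filtered intar2 data compresses like intar1.

```python
import struct
import zlib

MASK = 0xFFFFFFFF  # 4-byte wide filter: deltas wrap around modulo 2**32


def delta_encode(values):
    """Replace each value with its difference from the previous value."""
    prev = 0
    out = []
    for v in values:
        out.append((v - prev) & MASK)
        prev = v
    return out


def delta_decode(deltas):
    """Inverse of delta_encode: a running sum modulo 2**32."""
    prev = 0
    out = []
    for d in deltas:
        prev = (prev + d) & MASK
        out.append(prev)
    return out


def pack(values):
    """Serialize as little-endian unsigned 4-byte integers (no array header)."""
    return struct.pack("<%dI" % len(values), *values)


n = 1000000
constant = [1] * n                # same element values as intar1
serial = list(range(1, n + 1))    # same element values as intar2

# zlib is an assumption here, used only because it ships with Python.
raw_constant = zlib.compress(pack(constant))
raw_serial = zlib.compress(pack(serial))
filtered_serial = zlib.compress(pack(delta_encode(serial)))

print("constant, no filter:  ", len(raw_constant))
print("serial, no filter:    ", len(raw_serial))
print("serial, delta filter: ", len(filtered_serial))
```

Decoding is just a running sum over the decompressed deltas, so the
filter itself is cheap. The hard part for JSONB remains deciding which
byte ranges to run it on.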

Regards,
Marti
