Re: jsonb format is pessimal for toast compression

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Claudio Freire <klaussfreire(at)gmail(dot)com>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Jan Wieck <jan(at)wi3ck(dot)info>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-09-17 03:45:32
Message-ID: 3365.1410925532@sss.pgh.pa.us
Lists: pgsql-hackers

Heikki Linnakangas <hlinnakangas(at)vmware(dot)com> writes:
> On 09/16/2014 10:37 PM, Robert Haas wrote:
>> On Tue, Sep 16, 2014 at 3:24 PM, Josh Berkus <josh(at)agliodbs(dot)com> wrote:
>>> Do you feel that way *as a code maintainer*? That is, if you ended up
>>> maintaining the JSONB code, would you still feel that it's worth the
>>> extra complexity? Because that will be the main cost here.

>> I feel that Heikki doesn't have a reputation for writing or committing
>> unmaintainable code.
>> I haven't reviewed the patch.

> The patch I posted was not pretty, but I'm sure it could be refined to
> something sensible.

We're somewhat comparing apples and oranges here, in that I pushed my
approach to something that I think is of committable quality (and which,
not incidentally, fixes some existing bugs that we'd need to fix in any
case); while Heikki's patch was just proof-of-concept. It would be worth
pushing Heikki's patch to committable quality so that we had a more
complete understanding of just what the complexity difference really is.

> There are many possible variations of the basic scheme of storing mostly
> lengths, but an offset for every N elements. I replaced the length with
> offset on some element and used a flag bit to indicate which it is.

Aside from the complexity issue, a demerit of Heikki's solution is that it
eats up a flag bit that we may well wish we had back later. On the other
hand, there's definitely something to be said for not breaking
pg_upgrade-ability of 9.4beta databases.
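
For readers following along, the scheme quoted above (mostly lengths, an
offset for every N elements, and a flag bit saying which one an entry
holds) can be sketched roughly as follows. This is only an illustration
in Python and is not taken from either patch: the names HAS_OFF,
VALUE_MASK, STRIDE, encode, start_offset and length_of are hypothetical,
as are the 32-bit entry width and the choice of the top bit as the flag.

```python
# Hypothetical layout: one 32-bit word per element.
HAS_OFF = 0x80000000     # assumed flag bit: entry holds an end offset, not a length
VALUE_MASK = 0x7FFFFFFF  # remaining 31 bits hold the length or the end offset
STRIDE = 4               # assumed N: every STRIDE-th entry stores an offset


def encode(lengths, stride=STRIDE):
    """Build the entry array: mostly lengths, an end offset every `stride` entries."""
    entries = []
    end = 0
    for i, n in enumerate(lengths):
        end += n
        if i % stride == 0:
            entries.append(HAS_OFF | end)  # end offset of element i
        else:
            entries.append(n)              # plain length of element i
    return entries


def start_offset(entries, index):
    """Start of element `index`: sum lengths backwards until an offset entry is hit."""
    offset = 0
    for i in range(index - 1, -1, -1):
        offset += entries[i] & VALUE_MASK
        if entries[i] & HAS_OFF:
            break  # that entry already accounts for everything before it
    return offset


def length_of(entries, index):
    """Length of element `index`, whichever form its entry is stored in."""
    value = entries[index] & VALUE_MASK
    if entries[index] & HAS_OFF:
        return value - start_offset(entries, index)
    return value
```

With lengths [3, 5, 2, 7, 4, 1] and a stride of 4, entries 0 and 4 carry
the flag and hold end offsets 3 and 21; the rest hold plain lengths. A
lookup walks back at most one stride's worth of entries, and the flag bit
it consumes is the one referred to above.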

> Perhaps a simpler approach would be to store lengths, but also store a
> separate smaller array of offsets, after the lengths array.

That way would also give up on-disk compatibility, and I'm not sure it's
any simpler in practice than your existing solution.
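
As a rough illustration of the alternative quoted above (a full array of
lengths followed by a smaller, separate array of offsets), here is a
minimal Python sketch. It is an assumption about how such a layout could
work, not code from any posted patch; encode_split, start_offset_split
and the stride of 4 are hypothetical.

```python
def encode_split(lengths, stride=4):
    """Return (lengths, offsets): offsets[k] is the start of element k * stride."""
    offsets = []
    start = 0
    for i, n in enumerate(lengths):
        if i % stride == 0:
            offsets.append(start)  # one start offset per group of `stride` elements
        start += n
    return list(lengths), offsets


def start_offset_split(lengths, offsets, index, stride=4):
    """Start of element `index`: group's stored offset plus the lengths inside the group."""
    base = index - index % stride  # first element of the group containing `index`
    return offsets[index // stride] + sum(lengths[base:index])
```

No flag bit is needed in this variant, since every entry in the first
array is a length, but the stored layout differs from the existing one.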

regards, tom lane
