Re: jsonb format is pessimal for toast compression

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Andrew Dunstan <andrew(at)dunslane(dot)net>
Cc: pgsql-hackers(at)postgreSQL(dot)org, Larry White <ljw1001(at)gmail(dot)com>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-08-08 15:54:24
Message-ID: 11062.1407513264@sss.pgh.pa.us
Lists: pgsql-hackers

Andrew Dunstan <andrew(at)dunslane(dot)net> writes:
> On 08/08/2014 11:18 AM, Tom Lane wrote:
>> That's not really the issue here, I think. The problem is that a
>> relatively minor aspect of the representation, namely the choice to store
>> a series of offsets rather than a series of lengths, produces
>> nonrepetitive data even when the original input is repetitive.
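
The effect described above can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the length pattern is invented, each entry is packed as a 4-byte little-endian integer rather than as a real JEntry, and zlib stands in for the TOAST compressor (pglz), which it is not. It only shows that a repetitive series of lengths turns into a strictly increasing, nonrepetitive series once stored as running offsets.

```python
import struct
import zlib
from itertools import accumulate


def pack_u32(values):
    """Pack integers as consecutive 4-byte little-endian words (hypothetical layout)."""
    return struct.pack(f"<{len(values)}I", *values)


# Repetitive input: the same four value lengths, over and over (invented data).
lengths = [5, 12, 3, 8] * 250

# The same information stored as running end offsets; every entry is now distinct.
offsets = list(accumulate(lengths))

packed_lengths = pack_u32(lengths)
packed_offsets = pack_u32(offsets)

# zlib is a stand-in here, not the compressor PostgreSQL uses for TOAST.
compressed_lengths = zlib.compress(packed_lengths)
compressed_offsets = zlib.compress(packed_offsets)

print("lengths:", len(packed_lengths), "->", len(compressed_lengths), "bytes")
print("offsets:", len(packed_offsets), "->", len(compressed_offsets), "bytes")
```

Both arrays occupy the same space uncompressed, but the lengths array is a short repeating pattern, while the offsets array never repeats a whole entry.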

> It would certainly be worth validating that changing this would fix the
> problem.
> I don't know how invasive that would be - I suspect (without looking
> very closely) not terribly much.

I took a quick look and saw that this wouldn't be that easy to get around.
As I'd suspected upthread, there are places that do random access into a
JEntry array, such as the binary search in findJsonbValueFromContainer().
If we have to add up all the preceding lengths to locate the corresponding
value part, we lose the performance advantages of binary search. AFAICS
that's applied directly to the on-disk representation. I'd thought
perhaps there was always a transformation step to build a pointer list,
but nope.
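
A minimal sketch of the trade-off, not the actual jsonb code: the function names, the flat key buffer, and the bytewise key ordering below are all assumptions made for illustration. With stored end offsets, each probe of a binary search finds its slice in constant time; with stored lengths, each probe must first add up every preceding length.

```python
def slice_with_offsets(data, end_offsets, i):
    """Locate entry i in O(1): its start is the previous entry's end offset."""
    start = end_offsets[i - 1] if i > 0 else 0
    return data[start:end_offsets[i]]


def slice_with_lengths(data, lengths, i):
    """Locate entry i in O(i): all preceding lengths must be summed first."""
    start = sum(lengths[:i])
    return data[start:start + lengths[i]]


def find_key(count, get_slice, target):
    """Binary search over `count` sorted entries; returns the index or -1."""
    lo, hi = 0, count
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = get_slice(mid)  # one random access per probe
        if candidate == target:
            return mid
        if candidate < target:
            lo = mid + 1
        else:
            hi = mid
    return -1


# Hypothetical container: sorted keys concatenated into one buffer.
keys = [b"alpha", b"beta", b"delta", b"gamma", b"omega"]
data = b"".join(keys)
lengths = [len(k) for k in keys]

end_offsets = []
total = 0
for n in lengths:
    total += n
    end_offsets.append(total)

by_offsets = find_key(len(keys), lambda i: slice_with_offsets(data, end_offsets, i), b"gamma")
by_lengths = find_key(len(keys), lambda i: slice_with_lengths(data, lengths, i), b"gamma")
print(by_offsets, by_lengths)
```

Both variants return the same index, but the lengths variant does work proportional to the probed index on every probe, so a search over N entries is no longer O(log N) overall.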

regards, tom lane
