Re: jsonb format is pessimal for toast compression

From: Josh Berkus <josh(at)agliodbs(dot)com>
To: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>
Cc: Claudio Freire <klaussfreire(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Jan Wieck <jan(at)wi3ck(dot)info>
Subject: Re: jsonb format is pessimal for toast compression
Date: 2014-09-16 19:24:21
Message-ID: 54188E65.6030304@agliodbs.com
Lists: pgsql-hackers

Heikki, Robert:

On 09/16/2014 11:12 AM, Heikki Linnakangas wrote:
> Are you looking for someone with a real life scenario, or just synthetic
> test case? The latter is easy to do.
>
> See attached test program. It's basically the same I posted earlier.
> Here are the results from my laptop with Tom's jsonb-lengths-merged.patch:

Thanks for that!

> postgres=# select * from testtimes ;
> elem | duration_ms
> ------+-------------
> 3674 | 0.530412
> 4041 | 0.552585
> 4445 | 0.581815

This looks like the level at which the difference becomes really
noticeable. Note, though, that this effect is completely swamped by the
difference between compressed and uncompressed storage.
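
To illustrate the access-cost difference under discussion, here is a toy
Python model (illustrative only, not the actual jsonb C code): with
per-element offsets, finding element i's start is a single array read,
while with per-element lengths you must sum the first i lengths, so access
cost grows linearly with i, matching the rising durations in the table
above.

```python
# Toy model of the two jsonb header layouts under discussion
# (illustrative only; not the actual PostgreSQL C implementation).
lengths = [5, 3, 8, 2, 7]                 # per-element lengths

# Offset-based header: precomputed, so lookup is a direct O(1) read.
offsets = [0]
for n in lengths:
    offsets.append(offsets[-1] + n)

def start_with_offsets(i):
    return offsets[i]                     # one array access

def start_with_lengths(i):
    return sum(lengths[:i])               # O(i) additions

for i in range(len(lengths)):
    assert start_with_offsets(i) == start_with_lengths(i)
print(start_with_offsets(3))              # prints 16
```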

> With unpatched git master, the runtime is flat, regardless of which
> element is queried, at about 0.29 s. With
> jsonb-with-offsets-and-lengths-2.patch, there's no difference that I
> could measure.

OK, thanks.
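
As I understand the offsets-and-lengths approach, it stores mostly lengths
(which compress well) plus an absolute offset at fixed intervals, so a
lookup is bounded by the stride rather than O(n). A rough Python sketch;
the names and the STRIDE value are invented for illustration, not taken
from the patch:

```python
# Hypothetical sketch of a stride-based hybrid header: an absolute
# offset is stored every STRIDE entries, lengths everywhere else.
# STRIDE and all names here are illustrative, not PostgreSQL's.
STRIDE = 4
lengths = [5, 3, 8, 2, 7, 1, 4, 6, 9, 2]

# Precompute checkpoint offsets once, as the writer would.
checkpoints = {}
running = 0
for i, n in enumerate(lengths):
    if i % STRIDE == 0:
        checkpoints[i] = running
    running += n

def element_start(i):
    base = (i // STRIDE) * STRIDE
    # At most STRIDE - 1 additions from the nearest checkpoint.
    return checkpoints[base] + sum(lengths[base:i])

assert element_start(9) == sum(lengths[:9])
```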

> The difference starts to be meaningful at around 500 entries. In
> practice, I doubt anyone's going to notice until you start talking about
> tens of thousands of entries.
>
> I'll leave it up to the jury to decide if we care or not. It seems like
> a fairly unusual use case, where you push around large enough arrays or
> objects to notice. Then again, I'm sure *someone* will do it. People do
> strange things, and they find ways to abuse the features that the
> original developers didn't think of.

Right, but the question is whether it's worth having more complex code
and data structures in order to support what certainly *seems* to be a
fairly obscure use-case, namely more than 4,000 keys at the same level.
And it's not as though jsonb stops working or becomes completely
unresponsive at that level; the response time merely doubles.

On 09/16/2014 12:20 PM, Robert Haas wrote:
> Basically, I think that if we make a decision to use Tom's patch
> rather than Heikki's patch, we're deciding that the initial decision,
> by the folks who wrote the original jsonb code, to make array access
> less than O(n) was misguided. While that could be true, I'd prefer to
> bet that those folks knew what they were doing. The only reason
> we're even considering changing it is that the array of lengths
> doesn't compress well, and we've got an approach that fixes that
> problem while preserving the advantages of fast lookup. We should
> have a darn fine reason to say no to that approach, and "it didn't
> benefit my particular use case" is not it.
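
The compression argument rests on the observation that an all-lengths
header is far more redundant than an all-offsets one: repeated lengths are
exactly what a first-pass match-finding compressor exploits, while
monotonically increasing offsets give it little to find. A quick
demonstration with Python's zlib standing in for pglz (an assumption; the
two differ, but both look for repeated byte sequences):

```python
import struct
import zlib

# 10,000 elements, all the same length: the uniform-document case
# where the compressibility gap is most visible.
n = 10_000
elem_len = 8
lengths = [elem_len] * n
offsets = [i * elem_len for i in range(n)]

lengths_bytes = struct.pack(f"<{n}I", *lengths)
offsets_bytes = struct.pack(f"<{n}I", *offsets)

c_lengths = len(zlib.compress(lengths_bytes))
c_offsets = len(zlib.compress(offsets_bytes))

# Repeated lengths collapse dramatically; rising offsets do not.
assert c_lengths < c_offsets
print(c_lengths, c_offsets)
```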

Do you feel that way *as a code maintainer*? That is, if you ended up
maintaining the JSONB code, would you still feel that it's worth the
extra complexity? Because that will be the main cost here.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
