Re: jsonb and nested hstore

From: Andrew Dunstan <andrew(at)dunslane(dot)net>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>
Cc: Josh Berkus <josh(at)agliodbs(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Teodor Sigaev <teodor(at)sigaev(dot)ru>
Subject: Re: jsonb and nested hstore
Date: 2014-02-24 19:15:51
Message-ID: 530B9A67.3050800@dunslane.net
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 02/24/2014 11:06 AM, Merlin Moncure wrote:
> On Mon, Feb 24, 2014 at 9:08 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>> On Mon, Feb 24, 2014 at 8:46 AM, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:
>>> I still find the phrasing "as jsonb is more efficient for most
>>> purposes" to be a bit off Basically, the text json type is faster for
>>> serialization/deserialization pattern (not just document preservation)
>>> and jsonb is preferred when storing json and doing repeated
>>> subdocument accesses.
>> Hm, I'm going to withdraw that. I had done some testing of simple
>> deserialization (cast to text and the like) and noted that jsonb was
>> as much as 5x slower. However, I just did some checking on
>> json[b]_populate_recordset though and it's pretty much a wash.
> [sorry for noise on this].
>
> Here's the use case coverage as I see it today:
>
> CASE: json jsonb hstore
> Static document: yes poor poor
> Precise document: yes no no
> Serialization: yes no no****
> Deserialization: poor*** yes* no****
> Repeated Access: poor yes yes
> Manipulation: no no** yes
> GIST/GIN searching: no no** yes
>
> notes:
> * jsonb gets 'yes' for deserialization assuming andrew's 'two level'
> deserialization fix goes in (otherwise 'poor').
> ** jsonb can't do this today, but presumably will be able to soon
> *** 'poor' unless json type also gets the deserialization fix, then 'yes'.
> **** hstore can deserialize hstore format, but will rely on json/jsonb
> for deserializing json
>
> 'Static document' represents edge cases where the json is opaque to
> the database but performance -- for example large map polygons.
> 'Precise document' represents cases where whitespace or key order is important.
>
> Peter asked upthread how to access the various features. Well, today,
> it basically means a bit of nimble casting to different structures
> depending on which particular features are important to you, which
> IMNSHO is not bad at all as long as we understand that most people who
> rely on jsonb will also need hstore for its searching and operators.
> Down the line when hstore and jsonb are more flushed out it's going to
> come down to an API style choice.
>

Frankly, a lot of the above doesn't make much sense to me. WTF is
"Manipulation'?

Unless I see much more actual info on the tests being conducted it's
just about impossible to comment. The performance assessment at this
stage is simply anecdotal as far as I'm concerned.

populate_record() is likely to be a *very* poor point of comparison
anyway, I would expect the performance numbers to be dominated by the
input function calls for the object's component types, and that's going
to be the same in both cases. If you want to prove something here you'll
need to supply profiling numbers showing where it spends its time in
each case.

Having had my schedule very seriously disrupted by the storm in the US
South East a week or so ago, I am finally getting back to being able to
devote some time to jsonb. I hope to have new patches available today or
tomorrow at the latest.

cheers

andrew

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Christian Kruse 2014-02-24 19:49:05 Re: Patch: show xid and xmin in pg_stat_activity and pg_stat_replication
Previous Message Bruce Momjian 2014-02-24 18:28:45 Re: psql should show disabled internal triggers