Re: Performance of JSON type in postgres

From: veem v <veema0000(at)gmail(dot)com>
To: Merlin Moncure <mmoncure(at)gmail(dot)com>, Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>, "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: pgsql-general <pgsql-general(at)lists(dot)postgresql(dot)org>
Subject: Re: Performance of JSON type in postgres
Date: 2025-07-19 20:39:56
Message-ID: CAB+=1TWwerQMVQvOP0oAVunTUz8tnX425Jn-84DBYbOD+us9Eg@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-general

On Tue, 15 Jul 2025 at 23:02, Merlin Moncure <mmoncure(at)gmail(dot)com> wrote:

> On Mon, Jul 14, 2025 at 2:01 PM David G. Johnston <
> david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
>
>> On Mon, Jul 14, 2025 at 12:54 PM Adrian Klaver <adrian(dot)klaver(at)aklaver(dot)com>
>> wrote:
>>
>>> On 7/14/25 12:51, veem v wrote:
>>> > So I want to
>>> > understand the experts' opinion on this which I believe will be
>>> > crucial during design itself.
>>>
>>> It is spelled out here:
>>>
>>> https://www.postgresql.org/docs/current/datatype-json.html
>>>
>>>
>> I've taken to heart the main takeaway from that page:
>>
>> "In general, most applications should prefer to store JSON data as jsonb,
>> unless there are quite specialized needs, such as legacy assumptions about
>> ordering of object keys."
>>
>
> I don't think the documentation is accurate at all, unless one of those
> specialized needs is to 'be faster'. json serialization is more than 2x
> faster based on simple testing (see below). This is absolutely not a
> trivial difference.
>
> I would say, use json for serialization, use jsonb for data storage,
> unless the precise structure of the input document is important.
>
> merlin
>
> leaselock_iam(at)leaselock_prod=> explain analyze select json_agg(l) from (
> select l from llcore.lease l limit 10000) q;
> QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=405.52..405.53 rows=1 width=32) (actual
> time=69.043..69.048 rows=1 loops=1)
> -> Limit (cost=0.00..380.52 rows=10000 width=247) (actual
> time=0.017..9.764 rows=10000 loops=1)
> -> Seq Scan on lease l (cost=0.00..100383.89 rows=2638089
> width=247) (actual time=0.016..8.831 rows=10000 loops=1)
> Planning Time: 0.109 ms
> Execution Time: 69.088 ms
> (5 rows)
>
> Time: 160.560 ms
> leaselock_iam(at)leaselock_prod=> explain analyze select jsonb_agg(l) from (
> select l from llcore.lease l limit 10000) q;
> QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------
> Aggregate (cost=405.52..405.53 rows=1 width=32) (actual
> time=146.139..146.141 rows=1 loops=1)
> -> Limit (cost=0.00..380.52 rows=10000 width=247) (actual
> time=0.017..20.837 rows=10000 loops=1)
> -> Seq Scan on lease l (cost=0.00..100383.89 rows=2638089
> width=247) (actual time=0.016..19.975 rows=10000 loops=1)
> Planning Time: 0.108 ms
> Execution Time: 152.277 ms
>
>

Thank you.

I tested below for sample data. I see loading or serialization seems a lot
slower(twice as slower) in JSONB as compared to JSON. Whereas storage looks
efficient in JSONB. and reading performance of nested fields are 7-8 times
slower in JSON as compared to JSONB(and ofcourse index support makes it a
better choice here). Hope i am testing it correctly here.

https://dbfiddle.uk/6P7sjL22

So I am a bit confused here . Also one of our use case is, along with
persisting this data and querying it in postgres database, We are also
going to move this data from postgres (which is a upstream OLTP system) to
a downstream OLAP system ,which is in Snowflake database which is having
data types like Variant or Varchar types. So, will it create a significant
difference if we store it in JSON vs JSONB in our postgres i.e the
source/upstream database?

In response to

Responses

Browse pgsql-general by date

  From Date Subject
Next Message Adrian Klaver 2025-07-19 20:59:36 Re: Performance of JSON type in postgres
Previous Message Jon Zeppieri 2025-07-18 20:29:24 Possible causes of high_replay lag, given replication settings?