From: | Rick Otten <rottenwindfish(at)gmail(dot)com> |
---|---|
To: | Nico Heller <nico(dot)heller(at)posteo(dot)de>, "pgsql-performa(dot)" <pgsql-performance(at)postgresql(dot)org> |
Subject: | Re: to_jsonb performance on array aggregated correlated subqueries |
Date: | 2022-08-12 20:17:02 |
Message-ID: | CAMAYy4KZEVsxCdz+iQEw4vOEABOD2psjnPAGQEYHg_6EMkQ8VA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-performance |
On Fri, Aug 12, 2022 at 3:02 PM Rick Otten <rottenwindfish(at)gmail(dot)com> wrote:
>
>
> On Fri, Aug 12, 2022 at 2:50 PM Nico Heller <nico(dot)heller(at)posteo(dot)de> wrote:
>
>> Good day,
>>
>> consider the following query:
>>
>> WITH aggregation(
>> SELECT
>> a.*,
>> (SELECT array_agg(b.*) FROM b WHERE b.a_id = a.id) as "bs",
>> (SELECT array_agg(c.*) FROM c WHERE c.a_id = a.id) as "cs",
>> (SELECT array_agg(d.*) FROM d WHERE d.a_id = a.id) as "ds",
>> (SELECT array_agg(e.*) FROM d WHERE e.a_id = a.id) as "es"
>> FROM a WHERE a.id IN (<some big list, ranging from 20-180 entries)
>> )
>> SELECT to_jsonb(aggregation.*) as "value" FROM aggregation;
>>
>>
> - You do have an index on `b.a_id` and `c.a_id`, etc... ? You didn't
> say...
>
> - Are you sure it is the `to_jsonb` that is making this query slow?
>
> - Since you are serializing this for easy machine readable consumption
> outside of the database, does it make a difference if you use `to_json`
> instead?
>
>
To follow up here a little. I ran some quick tests on my database and
found that `to_json` is consistently, slightly, faster than `to_jsonb` when
you are just serializing the result set for consumption. I feed in some
arrays of 1,000,000 elements for testing. While both json serializers are
slower than just sending back the result set, it wasn't significant on my
machine with simple object types. (3% slower).
Are any of your objects in "b.*", etc, complex data structures or deeper
arrays, or gis shapes, or strange data types that might be hard to
serialize? I'm wondering if there is something hidden in those ".*" row
sets that are particularly problematic and compute intensive to process.
From | Date | Subject | |
---|---|---|---|
Next Message | Kevin McKibbin | 2022-08-21 02:08:47 | pgbench: could not connect to server: Resource temporarily unavailable |
Previous Message | Nico Heller | 2022-08-12 19:48:07 | Re: to_jsonb performance on array aggregated correlated subqueries |