json/jsonb cleanup + FmgrInfo caching

From: Robert Haas <robertmhaas(at)gmail(dot)com>
To: "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: json/jsonb cleanup + FmgrInfo caching
Date: 2026-07-02 16:25:23
Message-ID: CA+TgmoaiothgQrw9OtgsMzBUCnqJ2jdaGTbS6o3fkjCd+LfzWw@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While working on my "sandboxing untrusted code" project, I found
myself investigating how the JSON and JSONB code calls type output and
cast functions. I feel like this needs some cleanup in order to avoid
blocking that project, and it turns out that there are also
significant opportunities to improve performance, so here are some
patches. One caveat: one of these patches causes a small backward
compatibility break, because the current behavior is wrong. See below
for full details.

First, a quick performance demonstration:

CREATE TABLE hstores AS SELECT hstore('k', g::text) x FROM
generate_series(1,4000000) g;
SELECT any_value(json_build_object('a', x)) FROM hstores;

Unpatched, median-of-three runs was 398.571 ms. Patched, 193.714 ms.
That's a 2.05x speedup, which is the highest I observed in my test
queries. What I found is that this technique shows the largest gains
with user-defined types like hstore, smaller gains with built-in types
like int, and the smallest gains of all with container types such as a
user-defined record type. Generally, jsonb benefited more than json.
The aggregates - json_agg and jsonb_agg - showed very little benefit
or even small regressions, but I'm pretty confident that the
regressions are simply noise that goes away with a sufficiently large
number of sufficiently-careful test runs. Everything else gets faster,
often by 50%+. Leaving aside the aggregates which are expected to show
little or no benefit, here's one of the less-sympathetic cases:

CREATE TABLE ints AS SELECT x FROM generate_series(1,4000000) x;
SELECT any_value(to_json(x)) FROM ints;

The speedup is smaller here because it's json rather than jsonb and
because it's int rather than hstore, but it's still 117.461 ms
unpatched vs. 89.895 patched, a 30% speedup.

OK, now let's go through the patches:

0001 refactors the json_categorize_type() function to initialize an
FmgrInfo instead of returning a base function OID. All of the built-in
JSON aggregates are updated to cache this FmgrInfo across calls, but
it really doesn't save much. In the process of working on this
refactoring it came to light that the current behavior of
json_check_mutability() is incorrect: it erroneously treats record,
anyarray, and anycompatiblearray as immutable when in fact they should
be treated as stable. This refactoring preserves that incorrect
behavior.

0002 fixes the bug discovered during the development of 0001 by
removing the special case. AFAICT, the anyarray and anycompatiblearray
cases are unreachable, but the record case is reachable, and the
included test case shows how this could hypothetically matter. It
seems unlikely we'll inconvenience any significant number of users by
changing this, but in theory somebody's upgrade could fail.

0003 moves some code around to avoid problems with circular header
dependencies, creating new files jsontypes.c/h.

0004 refactors the SQL-level JSON constructors -- JSON_OBJECT,
JSON_ARRAY, and JSON_SCALAR -- to make use of the new type caching
infrastructure.

0005 refactors the SQL-callable functions similarly. This means
to_json(b), json(b)_build_object, and json(b)_build_array.

Thanks,

--
Robert Haas
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v1-0003-Create-jsontypes.c-h-and-use-that-to-tidy-up-a-fe.patch application/octet-stream 18.7 KB
v1-0005-Cache-JSON-type-information-in-various-SQL-callab.patch application/octet-stream 12.6 KB
v1-0002-Don-t-treat-record-json-b-conversions-as-immutabl.patch application/octet-stream 3.0 KB
v1-0004-Allow-JSON_OBJECT-JSON_ARRAY-JSON_SCALAR-to-cache.patch application/octet-stream 18.1 KB
v1-0001-Refactor-json_categorize_type-to-populate-an-Fmgr.patch application/octet-stream 39.8 KB

Browse pgsql-hackers by date

  From Date Subject
Next Message Bryan Green 2026-07-02 16:30:39 pg_dump: use threads for parallel workers on all platforms
Previous Message Fujii Masao 2026-07-02 15:49:24 Re: Truncate logs by max_log_size