Optimization of some jsonb functions

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Optimization of some jsonb functions
Date: 2019-02-22 00:05:33
Message-ID: 7c417f90-f95f-247e-ba63-d95e39c0ad14@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Attached set of patches with some jsonb optimizations that were made during
comparison of performance of ordinal jsonb operators and jsonpath operators.

1. Optimize JsonbExtractScalar():
It is better to use getIthJsonbValueFromContainer(cont, 0) instead of
JsonIterator to get 0th element of raw-scalar pseudoarray.
JsonbExtractScalar() is used in jsonb casts, so they speed up a bit.

2. Optimize operator #>>, jsonb_each_text(), jsonb_array_elements_text():
These functions have direct conversion (JsonbValue => text) only for
jbvString scalars, but indirect conversion of other types of scalars
(JsonbValue => jsonb => text) is obviously too slow. Extracted common
subroutine JsonbValueAsText() and used in all suitable places.

3. Optimize JsonbContainer type recognition in get_jsonb_path_all():
Fetching of the first token from JsonbIterator is replaced with lightweight
JsonbContainerIsXxx() macros.

4. Extract findJsonbKeyInObject():
Extracted findJsonbKeyInObject() from findJsonbValueFromContainer(),
which is slightly easier to use (key string and its length is passed instead
of filled string JsonbValue).

5. Optimize resulting value allocation in findJsonbValueFromContainer() and
getIthJsonbValueFromContainer():
Added ability to pass stack-allocated JsonbValue that will be filled with
the result of operation instead of returning unconditionally palloc()ated
JsonbValue.

Patches #4 and #5 are mostly refactorings, but they can give small speedup
(up to 5% for upcoming jsonpath operators) due to elimination of unnecessary
palloc()s. The whole interface of findJsonbValueFromContainer() with JB_OBJECT
and JB_ARRAY flags always seemed a bit strange to me, so I think it is worth to
have separate functions for searching keys in objects and elements in arrays.

Performance tests:
- Test data for {"x": {"y": {"z": i}}}:
CREATE TABLE t AS
SELECT jsonb_build_object('x',
jsonb_build_object('y',
jsonb_build_object('z', i))) js
FROM generate_series(1, 3000000) i;

- Sample query:
EXPLAIN (ANALYZE) SELECT js -> 'x' -> 'y' -> 'z' FROM t;

- Results:
| execution time, ms
query | master | optimized
-------------------------------------------------------------------------------
{"x": {"y": {"z": i}}}
js #> '{x,y,z}' | 1148.632 | 1005.578 -10%
js #>> '{x,y,z}' | 1520.160 | 849.991 -40%
(js #> '{x,y,z}')::numeric | 1310.881 | 1067.752 -20%
(js #>> '{x,y,z}')::numeric | 1757.179 | 1109.495 -30%

js -> 'x' -> 'y' -> 'z' | 1030.211 | 977.267
js -> 'x' -> 'y' ->> 'z' | 887.101 | 838.745
(js -> 'x' -> 'y' -> 'z')::numeric | 1184.086 | 1050.462
(js -> 'x' -> 'y' -> 'z')::int4 | 1279.315 | 1133.032
(js -> 'x' -> 'y' ->> 'z')::numeric | 1134.003 | 1100.047
(js -> 'x' -> 'y' ->> 'z')::int4 | 1077.216 | 991.995

js ? 'x' | 523.111 | 495.387
js ?| '{x,y,z}' | 612.880 | 607.455
js ?& '{x,y,z}' | 674.786 | 643.987
js -> 'x' -> 'y' ? 'z' | 712.623 | 698.588
js @> '{"x": {"y": {"z": 1}}}' | 1154.926 | 1149.069

jsonpath:
js @@ '$.x.y.z == 123' | 973,444 | 912,08 -5%

{"x": i, "y": i, "z": i}
jsonb_each(js) | 2281.577 | 2262.660
jsonb_each_text(js) | 2603.539 | 2112.200 -20%

[i, i, i]
jsonb_array_elements(js) | 1255.210 | 1205.939
jsonb_array_elements(js)::numeric | 1662.550 | 1576.227 -5%
jsonb_array_elements_text(js) | 1555.021 | 1067.031 -30%

js @> '1' | 798.858 | 768.664 -4%
js <@ '[1,2,3]' | 820.795 | 785.086 -5%
js <@ '[0,1,2,3,4,5,6,7,8,9]' | 1214.170 | 1165.289 -5%

As it can be seen, #> operators are always slower than equivalent series of ->.
I think it is caused by array deconstruction in "jsonb #> text[]".

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Optimize-JsonbExtractScalar-v01.patch text/x-patch 1.6 KB
0002-Optimize-jsonb_xxx_text-functions-v01.patch text/x-patch 6.2 KB
0003-Optimize-JsonbContainer-type-recognition-v01.patch text/x-patch 1.3 KB
0004-Extract-findJsonbKeyInObject-v01.patch text/x-patch 10.7 KB
0005-Optimize-resulting-JsonbValue-allocation.patch text/x-patch 8.6 KB

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2019-02-22 00:13:13 Re: boolean and bool in documentation
Previous Message Stephen Frost 2019-02-21 23:39:10 Re: [PATCH v20] GSSAPI encryption support