Re: SQL/JSON: functions

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Oleg Bartunov <obartunov(at)postgrespro(dot)ru>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Cc: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Andres Freund <andres(at)anarazel(dot)de>, Michael Paquier <michael(at)paquier(dot)xyz>, Simon Riggs <simon(at)2ndquadrant(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: SQL/JSON: functions
Date: 2018-06-28 00:18:03
Message-ID: e467c355-6fc7-e1b0-44d5-9707eca8f1dd@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

On 15.03.2018 20:04, Nikita Glukhov wrote:
> Attached 13th version of the patches:
>
> * Subtransactions in PG_TRY/CATCH in ExecEvalJsonExpr() were made unconditional,
> regardless of the volatility of expressions.
>
> * PG_TRY/CATCH in ExecEvalExprPassingCaseValue() was removed along with the
> entire function.

Attached 15th version of the patches:
* disabled parallel execution of SQL/JSON query functions when internal
subtransactions are used (if ERROR ON ERROR is not specified)
* added experimental optimization of internal subtransactions (see below)

The new patch #14 is an experimental attempt to reduce overhead of
subtransaction start/commit which can result in 2x-slowdown in the simplest
cases. By the idea of Alexander Korotkov, subtransaction is not really
committed if it has not touched the database and its XID has not been assigned
(DB modification is not expected in type casts functions) and then can be reused
when the next subtransaction is started. So, all rows in JsonExpr can be
executed in the single cached subtransaction. This optimization really helps
to reduce overhead from 100% to 5-10%:

-- without subtransactions
=# EXPLAIN ANALYZE
SELECT JSON_VALUE('true'::jsonb, '$' RETURNING boolean ERROR ON ERROR)
FROM generate_series(1, 10000000) i;
...
Execution Time: 2785.410 ms

-- cached subtransactions
=# EXPLAIN ANALYZE
SELECT JSON_VALUE('true'::jsonb, '$' RETURNING boolean)
FROM generate_series(1, 10000000) i;
...
Execution Time: 2939.363 ms

-- ordinary subtransactions
=# EXPLAIN ANALYZE
SELECT JSON_VALUE('true'::jsonb, '$' RETURNING boolean)
FROM generate_series(1, 10000000) i;
...
Execution Time: 5417.268 ms

But, unfortunately, I don't believe that this patch is completely correct,
mainly because the behavior of subtransaction callbacks (and their expectations
about subtransaction's lifecycle too) seems unpredictable to me.

Even with this optimization, internal subtransactions still have one major
drawback -- they disallow parallel query execution, because background
workers do not support subtransactions now. Example:

=# CREATE TABLE test_parallel_json_value AS
SELECT i::text::jsonb AS js FROM generate_series(1, 5000000) i;
CREATE TABLE

=# EXPLAIN ANALYZE
SELECT sum(JSON_VALUE(js, '$' RETURNING numeric ERROR ON ERROR))
FROM test_parallel_json_value;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
Finalize Aggregate (cost=79723.15..79723.16 rows=1 width=32) (actual time=455.062..455.062 rows=1 loops=1)
-> Gather (cost=79722.93..79723.14 rows=2 width=32) (actual time=455.052..455.055 rows=3 loops=1)
Workers Planned: 2
Workers Launched: 2
-> Partial Aggregate (cost=78722.93..78722.94 rows=1 width=32) (actual time=446.000..446.000 rows=1 loops=3)
-> Parallel Seq Scan on t (cost=0.00..52681.30 rows=2083330 width=18) (actual time=0.023..104.779 rows=1666667 loops=3)
Planning Time: 0.044 ms
Execution Time: 456.460 ms
(8 rows)

=# EXPLAIN ANALYZE
SELECT sum(JSON_VALUE(js, '$' RETURNING numeric))
FROM test_parallel_json_value;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
Aggregate (cost=144347.82..144347.83 rows=1 width=32) (actual time=1381.938..1381.938 rows=1 loops=1)
-> Seq Scan on t (cost=0.00..81847.92 rows=4999992 width=18) (actual time=0.076..309.676 rows=5000000 loops=1)
Planning Time: 0.082 ms
Execution Time: 1384.133 ms
(4 rows)

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0010-add-invisible-coercion-form-v15.patch text/x-patch 6.7 KB
0011-add-function-formats-v15.patch text/x-patch 10.1 KB
0012-sqljson-v15.patch text/x-patch 295.6 KB
0013-sqljson-json-v15.patch text/x-patch 55.8 KB
0014-optimize-sqljson-subtransactions-v15.patch text/x-patch 8.0 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Nikita Glukhov 2018-06-28 00:34:38 Re: SQL/JSON: JSON_TABLE
Previous Message Peter Geoghegan 2018-06-28 00:00:31 Re: [WIP] [B-Tree] Retail IndexTuple deletion