Re: jsonpath

From: Oleg Bartunov <obartunov(at)gmail(dot)com>
To: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
Cc: Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: jsonpath
Date: 2018-02-26 19:36:28
Message-ID: CAF4Au4yVXeRrLUE0v+dyp5VvhLNcKkBbc3MEowkFpOQ+Eff4_Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 26, 2018 at 6:34 PM, Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru> wrote:
> Attached 10th version of the jsonpath patches.
>
> 1. Fixed error handling in arithmetic operators.
>
> Now run-time errors in arithmetic operators are catched (added
> PG_TRY/PG_CATCH around operator's functions calls) and converted into
> Unknown values in predicates as it is required by the standard:
>
> =# SELECT jsonb '[1,0,2]' @* '$[*] ? (1 / @ > 0)';
> ?column?
> ----------
> 1
> 2
> (2 rows)
>
> 2. Fixed grammar for parenthesized expressions.
>
> 3. Refactored GIN support for jsonpath operators.
>
> 4. Added one more operator json[b] @# jsonpath returning singleton json[b]
> with
> automatic conditional wrapping of sequences with more than one element
> into
> arrays:
>
> =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 2)';
> ?column?
> -----------
> [3, 4, 5]
> (1 row)
>
> =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 4)';
> ?column?
> ----------
> 5
> (1 row)
>
> =# SELECT jsonb '[1,2,3,4,5]' @# '$[*] ? (@ > 5)';
> ?column?
> ----------
> (null)
> (1 row)
>
> Existing set-returning operator json[b] @* jsonpath is also very userful
> but
> can't be used in functional indices like new operator @#.
>
> Note that conditional wrapping of @# differs from the wrapping in
> JSON_QUERY(... WITH [ARRAY] WRAPPER), where only singleton objects and
> arrays are not wrapped. Unconditional wrapping can be emulated with our
> array construction feature (see below).
>
> 5. Changed time zone behavior in .datetime() item method.
>
> In the previous version of the patch timestamptz SQL/JSON items were
> serialized into JSON string items using session time zone. This behavior
> did not allow jsonpath operators to be marked as immutable, and therefore
> they could not be used in functional indices. Also, when the time zone
> was
> not specified in the input string, but TZM or TZH format fields were
> present
> in the format string, session time zone was used as a default for
> timestamptz items.
>
> To make jsonpath operators immutable we decided to save input time zone
> for
> timestamptz items and disallow automatic time zone assignment. Also
> additional parameter was added to .datetime() for default time zone
> specification:
>
> =# SET timezone = '+03';
> SET
>
> =# SELECT jsonb '"10-03-2017 12:34:56"' @*
> '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
> ERROR: Invalid argument for SQL/JSON datetime function
>
> =# SELECT jsonb '"10-03-2017 12:34:56"' @*
> '$.datetime("DD-MM-YYYY HH24:MI:SS TZH", "+05")';
> ?column?
> -----------------------------
> "2017-03-10T12:34:56+05:00"
> (1 row)
>
> =# SELECT jsonb '"10-03-2017 12:34:56 +05"' @*
> '$.datetime("DD-MM-YYYY HH24:MI:SS TZH")';
> ?column?
> -----------------------------
> "2017-03-10T12:34:56+05:00"
> (1 row)
>
> Please note that our .datetime() behavior is not standard now: by the
> standard, input and format strings must match exactly, i.e. they both
> should
> not contain trailing unmatched elements, so automatic time zone
> assignment
> is impossible. But it too restrictive for PostgreSQL users, so we
> decided
> to preserve usual PostgreSQL behavior here:
>
> =# SELECT jsonb '"10-03-2017"' @* '$.datetime("DD-MM-YYYY HH24:MI:SS")';
> ?column?
> -----------------------
> "2017-03-10T00:00:00"
> (1 row)

I think someday we should consider adding support for sql standard
conforming datetime. Since it
breaks postgres behaviour we will need 'standard_conforming_datetime' guc.

>
>
> Also PostgreSQL is able to automatically recognize format of the input
> string for the specified datetime type, but we can only bring this
> behavior
> into jsonpath by introducing separate item methods .date(), .time(),
> .timetz(), .timestamp() and .timestamptz(). Also we can use here our
> unfinished feature that gives us ability to work with PostresSQL types in
> jsonpath using cast operator :: (see sqljson_ext branch in our github
> repo):
>
> =# SELECT jsonb '"10/03/2017 12:34"' @* '$::timestamptz';
> ?column?
> -----------------------------
> "2017-03-10T12:34:00+03:00"
> (1 row)

Another note.
We decided to preserve TZ in JSON_QUERY function and follow standard
Postgres behaviour in JSON_VALUE,
since JSON_QUERY returns JSON object and JSON_VALUE returns SQL value.

SELECT JSON_QUERY(jsonb '"2018-02-21 17:01:23 +05"',
'$.datetime("YYYY-MM-DD HH24:MI:SS TZH")');
json_query
-----------------------------
"2018-02-21T17:01:23+05:00"
(1 row)

show timezone;
TimeZone
----------
W-SU
(1 row)

SELECT JSON_VALUE(jsonb '"2018-02-21 17:01:23 +05"',
'$.datetime("YYYY-MM-DD HH24:MI:SS TZH")');
json_value
------------------------
2018-02-21 15:01:23+03
(1 row)
>
>
>
> A brief description of the extra jsonpath syntax features contained in the
> patch #7:
>
> * Sequence construction by joining path expressions with comma:
>
> =# SELECT jsonb '[1, 2, 3]' @* '$[*], 4, 5';
> ?column?
> ----------
> 1
> 2
> 3
> 4
> 5
> (5 rows)
>
> * Array construction by placing sequence into brackets (equivalent to
> JSON_QUERY(... WITH UNCONDITIONAL WRAPPER)):
>
> =# SELECT jsonb '[1, 2, 3]' @* '[$[*], 4, 5]';
> ?column?
> -----------------
> [1, 2, 3, 4, 5]
> (1 row)
>
> * Object construction by placing sequences of key-value pairs into braces:
>
> =# SELECT jsonb '{"a" : [1, 2, 3]}' @* '{a: [$.a[*], 4, 5], "b c":
> "dddd"}';
> ?column?
> ---------------------------------------
> {"a": [1, 2, 3, 4, 5], "b c": "dddd"}
> (1 row)
>
> * Object subscripting with string-valued expressions:
>
> =# SELECT jsonb '{"a" : "aaa", "b": "a", "c": "ccc"}' @* '$[$.b, "c"]';
> ?column?
> ----------
> "aaa"
> "ccc"
> (2 rows)
>
> * Support of UNIX epoch time in .datetime() item method:
>
> =# SELECT jsonb '1519649957.37' @* '$.datetime()';
> ?column?
> --------------------------------
> "2018-02-26T12:59:17.37+00:00"
> (1 row)
>

Documentation in user-friendly format (it will be convered to xml, of
course) is available
https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
We are permanently working on it.

>
> --
> Nikita Glukhov
> Postgres Professional:http://www.postgrespro.com
> The Russian Postgres Company
>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Chapman Flack 2018-02-26 19:40:06 Re: Precision loss casting float to numeric
Previous Message Tom Lane 2018-02-26 19:03:56 Re: 2018-03 CFM