Re: SQL/JSON: documentation

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Chapman Flack <chap(at)anastigmatix(dot)net>, PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Cc: Liudmila Mantrova <l(dot)mantrova(at)postgrespro(dot)ru>, Oleg Bartunov <obartunov(at)gmail(dot)com>
Subject: Re: SQL/JSON: documentation
Date: 2018-06-28 22:45:30
Message-ID: 85ef2322-c466-6ef0-a5af-3d9d8f07d128@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 28.06.2018 05:23, Chapman Flack wrote:

> On 06/27/2018 07:36 PM, Nikita Glukhov wrote:
>
>> Also it can be found in our sqljson repository on sqljson_doc branch:
>> https://github.com/postgrespro/sqljson/tree/sqljson_doc
> Perhaps it is my unfamiliarity, but it seems that on lines 1067–1071,
> the output clause for JSON_VALUE is given support for return types
> json, jsonb, bytea, text, char, varchar, nchar "out of the box".
>
> There are then examples on lines 1123–1135 of returning float, int,
> and date.
>
> Does that mean that the list in 1067–1071 is incomplete, and should
> include additional data types?
>
> Or does it mean that there is more cleverness buried in the
> "must ... have a cast to the specified type" language than I
> first understood?
>
> Does the function support returning some wanted type w, not in the
> out-of-the-box list, such as float, by searching for an intermediate
> type t ∈ {json, jsonb, bytea, text, char, varchar, nchar} such that
> ∃ cast(t as w), then representing the JSON value as t, then casting
> that to w ?
>
> If so, what does it do if more than one t is a candidate?

First, thank you for your interest in SQL/JSON docs.

Standard says only about returning of string (both binary and character),
numeric, boolean and datetime types in JSON_VALUE and only about string
types in JSON_QUERY.

In JSON_VALUE first searched cast from the SQL type corresponding to the
SQL/JSON type of a resulting scalar item to the target RETURNING type.

SQL/JSON type PG SQL type
string => text
number => numeric
boolean => boolean
date => date
time => time
time with tz => timetz
timestamp => timestamp
timestamp with tz => timestamptz

If this cast does not exist then conversion via input/output is tried (this
is our extension). But json and jsonb RETURNING types are exceptional here,
because SQL/JSON items can be converted directly to json[b] without casting.

But we also support returning of arbitrary PG types including arrays, domains
and records in both JSON_VALUE and JSON_QUERY. In JSON_VALUE values of this
types should be represented as serialized JSON strings, because JSON_VALUE
supports only returning of scalar items. The behavior of JSON_QUERY is similar
to the behavior json[b]_populate_record().

Examples:

-- CAST(numeric AS int) is used here
=# SELECT JSON_VALUE('1.8', '$' RETURNING int);
json_value
------------
2
(1 row)

-- CAST(text AS int) is used here
=# SELECT JSON_VALUE('"1"', '$' RETURNING int);
json_value
------------
1
(1 row)

-- CAST(text AS int) is used here
=# SELECT JSON_VALUE('"1.8"', '$' RETURNING int ERROR ON ERROR);
ERROR: invalid input syntax for integer: "1.8"

-- CAST(numeric AS int) is used here
# SELECT JSON_VALUE('"1.8"', '$.double().floor()' RETURNING int);
json_value
------------
1
(1 row)

-- array of points serialized into single JSON string
-- CAST(text AS point[]) is used
=# SELECT JSON_VALUE('"{\"(1,2)\",\"3,4\",NULL}"', '$' RETURNING point[]);
json_value
------------------------
{"(1,2)","(3,4)",NULL}
(1 row)

-- point[] is represented by JSON array of point strings
-- ARRAY[CAST(text AS point)] is used
=# SELECT JSON_QUERY('["(1, 2)", " 3 , 4 ", null]', '$' RETURNING point[]);

json_query
------------------------
{"(1,2)","(3,4)",NULL}
(1 row)

-- JSON object converted into SQL record type
=# SELECT JSON_QUERY('{"relname": "foo", "relnatts" : 5}', '$' RETURNING pg_class);
json_query
----------------------------------------
(foo,,,,,,,,,,,,,,,,5,,,,,,,,,,,,,,,,)
(1 row)

> Line 2081: "A typical path expression has the following structure"
>
> It seems like a "weasel word" to have "typical" in the statement
> of an expression grammar. Is there more to the grammar than is
> given here?

Yes, that expression grammar is incomplete because arithmetic operations
are supported on the top of jsonpath accessor expressions.

Here is nearly complete expression grammar (predicates are not included):

jsonpath ::=
[STRICT | LAX] jsonpath_expression

jsonpath_expression ::=
jsonpath_additive_expression

jsonpath_additive_expression ::=
[ jsonpath_additive_expression { + | - } ]
jsonpath_multiplicative_expression

jsonpath_multiplicative_expression ::=
[ jsonpath_multiplicative_expression { * | / | % } ]
jsonpath_unary_expression

jsonpath_unary_expression ::=
jsonpath_accessor_expression
| { + | - } jsonpath_unary_expression

jsonpath_accessor_expression ::=
jsonpath_primary { jsonpath_accessor }[...]

jsonpath_accessor ::=
  . *
  | . key_name
| . method_name ( jsonpath_expression [, ...] )
| '[' * ']'
| '[' jsonpath_expression [, ...] ']'
| ? ( predicate )

jsonpath_primary ::=
$
 | @
| variable
 | literal
| ( jsonpath_expression )

> Lines 2323 and 2330 ( / and % operators ). Do these behave differently
> for integer than for float operands? If they provide integer operations,
> which results do they produce for negative operands? (A recent minor
> trauma reminded me that C before C99 left that unspecified, but as this
> is a special-purpose language, perhaps there is a chance to avoid
> leaving such details vague. :) For a similar-language example,
> XPath/XQuery specifies that its idiv and mod operators have the
> truncate-quotient-toward-zero semantics, regardless of the signs of
> the operands.

Arithmetic operations in jsonpath are implemented using PG numeric datatype,
which also is used in jsonb for representation of JSON numbers:

=# SELECT jsonb '3' @* '$ / 2';
?column?
--------------------
1.5000000000000000
(1 row)

=# SELECT jsonb '3.4' @* '$ % 2.3';
?column?
----------
1.1
(1 row)

=# SELECT jsonb '-3.4' @* '$ % 2.3';
?column?
----------
-1.1
(1 row)

The same behavior exists in JavaScript, but it seems that ordinary double
type is used there.

> Line 2519, like_regex: What regex dialect is accepted here? The same
> as the PostgreSQL "POSIX regex"? Or some other?

Standard requires XQuery regexes, but we have only POSIX regexes in PostgreSQL
now, so we decided to use the latter.

We will fix all these issues soon.

> This looks like very interesting functionality!
>
> -Chap

You can try this SQL/JSON examples in our web interface:
http://sqlfiddle.postgrespro.ru/#!21/
(please first select "PostgreSQL 11dev+SQL/JSON" in the version
selection field on the top toolbar).

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Daniel Gustafsson 2018-06-28 22:57:21 Re: CREATE TABLE .. LIKE .. EXCLUDING documentation
Previous Message Peter Eisentraut 2018-06-28 21:23:11 Re: Listing triggers in partitions (was Re: Remove mention in docs that foreign keys on partitioned tables)