Re: SQL/JSON: JSON_TABLE

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Dmitry Dolgov <9erthalion6(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>
Subject: Re: SQL/JSON: JSON_TABLE
Date: 2019-11-21 16:28:00
Message-ID: 0350fdf8-c44d-46ab-6cc3-d39e77130cb1@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 17.11.2019 13:35, Pavel Stehule wrote:

> Hi
>
> út 12. 11. 2019 v 22:51 odesílatel Nikita Glukhov
> <n(dot)gluhov(at)postgrespro(dot)ru <mailto:n(dot)gluhov(at)postgrespro(dot)ru>> napsal:
>
> On 12.11.2019 20:54, Pavel Stehule wrote:
>
> > Hi
> >
> > please, can you rebase 0001-SQL-JSON-functions-v40.patch. I have a
> > problem with patching
> >
> > Pavel
>
> Attached 41th version of the patches rebased onto current master.
>
>
> I testing functionality - randomly testing some examples that I found
> on internet.
>
Thank you for testing JSON_TABLE.

> I found:
> a) Oracle & MySQL (Oracle) supports EXISTS clause, this implementation not.
> I think should be useful support this clause too.
> SELECT * FROM JSON_TABLE('...', '...' COLUMNS x INT EXISTS PATH ...
>
EXISTS PATH clause can be emulated with jsonpath EXISTS() predicate:

=# SELECT *
FROM JSON_TABLE('{"a": 1}', '$'
COLUMNS (
a bool PATH 'exists($.a)',
b bool PATH 'exists($.b)'
));
a | b
---+---
t | f
(1 row)

But this works as expected only in lax mode. In strict mode EXISTS() returns
Unknown that transformed into SQL NULL:

=# SELECT *
FROM JSON_TABLE('{"a": 1}', '$'
COLUMNS (
a bool PATH 'strict exists($.a)',
b bool PATH 'strict exists($.b)'
));
a | b
---+---
t |
(1 row)

There is no easy way to return false without external COALESCE(),
DEFAULT false ON ERROR also does not help.

So, I think it's worth to add EXISTS PATH clause to our implementation.

> There is a question how to map boolean result to other data types.

Now, boolean result can be used in JSON_TABLE columns of bool, int4, text,
json[b], and other types which have CAST from bool:

SELECT *
FROM JSON_TABLE('{"a": 1}', '$'
COLUMNS (
a int PATH 'exists($.a)',
b text PATH 'exists($.b)'
));
a | b
---+-------
1 | false
(1 row)

> b) When searched value is not scalar, then it returns null. This behave can be
> suppressed by clause FORMAT Json. I found a different behave, and maybe I found
> a bug. On MySQL this clause is by default for JSON values (what has sense).
> SELECT *
> FROM
>       JSON_TABLE(
>         '[{"a":[1,2]}]',
>         '$[*]'
>         COLUMNS(
>          aj JSON PATH '$.a' DEFAULT '{"x": 333}' ON EMPTY
>         )
>       ) AS tt;
> It returns null, although it should to return [1,2].

Yes, regular (non-formatted) JSON_TABLE columns can accept only scalar values.
Otherwise an error is thrown, which can be caught by ON ERROR clause. This
behavior is specified by the standard.

FORMAT JSON is not implicitly added for json[b] columns now. The current SQL
standard does not have any json data types, so I think we can add implicit
FORMAT JSON for json[b] typed-columns. But I'm a bit afraid that different
behavior can be standardized after introduction of json data types in SQL.

> There is another bug maybe. Although there is DEFAULT clause. It returns NULL.

ON ERROR should be used if "not a scalar" error needs to be caught:

SELECT *
FROM
JSON_TABLE(
'[{"a":[1,2]}]',
'$[*]'
COLUMNS(
aj JSON PATH '$.a' DEFAULT '{"x": 333}' ON ERROR
)
) AS tt;

aj
------------
{"x": 333}
(1 row)

ON EMPTY catches only empty-result case (for example, non-existent path in
lax mode):

SELECT *
FROM
JSON_TABLE(
'[{"a":[1,2]}]',
'$[*]'
COLUMNS(
aj JSON PATH '$.foo' DEFAULT '{"x": 333}' ON EMPTY
)
) AS tt;
aj
------------
{"x": 333}
(1 row)

> I got correct result when I used FORMAT JSON clause.
> I think it should be default behave for json and jsonb columns.

I agree that FORMAT JSON could be implicit for json[b] columns. But I think
there could be one minor problem if we want to verify that returned value is
scalar.

Without FORMAT JSON this is verified by the underlying JSON_VALUE expression:

SELECT *
FROM
JSON_TABLE(
'[{"a":[1,2]}]',
'$[*]'
COLUMNS (
aj JSON PATH 'lax $.a' ERROR ON ERROR
)
) AS tt;
ERROR: JSON path expression in JSON_VALUE should return singleton scalar item

(This error message with the reference to implicit JSON_VALUE needs to be fixed.)

But with FORMAT JSON we need to construct complex jsonpath with a filter and
override ON EMPTY behavior:

SELECT *
FROM
JSON_TABLE(
'[{"a":[1,2]}]',
'$[*]'
COLUMNS (
aj JSON FORMAT JSON
-- strict mode is mandatory to prevent array unwrapping
PATH 'strict $.a ? (@.type() != "array" && @.type() != "object")'
ERROR ON EMPTY ERROR ON ERROR
)
) AS tt;
ERROR: no SQL/JSON item

> Another question - when I used FORMAT JSON clause, then I got syntax error
> on DEFAULT keyword .. . Is it correct?
>
> Why I cannot to use together FORMAT JSON and DEFAULT clauses?

JSON_TABLE columns with FORMAT JSON, like JSON_QUERY, can have only
ERROR, NULL, EMPTY ARRAY, EMPTY OBJECT behaviors.

This syntax is specified in the SQL standard:

<JSON table formatted column definition> ::=
<column name> <data type> FORMAT <JSON representation>
[ PATH <JSON table column path specification> ]
[ <JSON table formatted column wrapper behavior> WRAPPER ]
[ <JSON table formatted column quotes behavior> QUOTES [ ON SCALAR STRING ] ]
[ <JSON table formatted column empty behavior> ON EMPTY ]
[ <JSON table formatted column error behavior> ON ERROR ]

<JSON table formatted column empty behavior> ::=
ERROR
| NULL
| EMPTY ARRAY
| EMPTY OBJECT

<JSON table formatted column error behavior> ::=
ERROR
| NULL
| EMPTY ARRAY
| EMPTY OBJECT

But I also think that DEFAULT clause could be very useful in JSON_QUERY and
formatted JSON_TABLE columns.

> Note - this behave is not described in documentation.

There are references to JSON_QUERY and JSON_VALUE behavior in the definitions
of JSON_TABLE columns, but their behavior still seems to be unclear. This
needs to be fixed.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2019-11-21 16:51:39 Re: SQL/JSON: JSON_TABLE
Previous Message Andy Fan 2019-11-21 15:57:22 Re: why doesn't optimizer can pull up where a > ( ... )