Re: jsonpath

From: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>
To: Stas Kelvich <s(dot)kelvich(at)postgrespro(dot)ru>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Thomas Munro <thomas(dot)munro(at)enterprisedb(dot)com>, David Steele <david(at)pgmasters(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(dot)dunstan(at)2ndquadrant(dot)com>, Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>, Robert Haas <robertmhaas(at)gmail(dot)com>, Oleg Bartunov <obartunov(at)gmail(dot)com>, Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
Subject: Re: jsonpath
Date: 2018-09-07 23:21:27
Message-ID: 715c1c44-9bb9-6b9f-7c59-61b66578f107@postgrespro.ru
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Attached 18th version of the patches rebased onto the current master.

On 04.09.2018 17:15, Stas Kelvich wrote:

>> On 23 Aug 2018, at 00:29, Nikita Glukhov<n(dot)gluhov(at)postgrespro(dot)ru> wrote:
>>
>> Attached 17th version of the patches rebased onto the current master.
>>
>> Nothing significant has changed.
> Hey.
>
> I’ve looked through presented implementation of jsonpath and have some remarks:

Thank you for your review.

> 1. Current patch set adds functions named jsonpath_* that implementing subset of functionality
> of corresponding json_* functions which doesn’t require changes to postgres SQL grammar.
> If we are going to commit json_* functions it will be quite strange for end user: two sets of functions
> with mostly identical behaviour, except that one of them allows customise error handling and pass variable via
> special language constructions. I suggest only leave operators in this patch, but remove functions.

We can't simply remove these functions because they are the implementation
of operators.

> 2. Handling of Unknown value differs from one that described in tech report: filter '(@.k == 1)’
> doesn’t evaluates to Unknown. Consider following example:
>
> select '42'::json @* '$ ? ( (@.k == 1) is unknown )';
>
> As per my understanding of standard this should show 42. Seems that it was evaluated to False instead,
> because boolean operators (! || &&) on filter expression with structural error behave like it is False.

Yes, this example should return 42, but only in the strict mode. In the lax
mode, which is default, structural errors are ignored, so '@.k' evaluates to
an empty sequence, and '@.k == 1' then returns False.

> 3. .keyvalue() returns object with field named ‘key’ instead of ’name’
> as per tech report. ‘key’ field seems to be more consistent with function
> name, but i’m not sure it is worths of mismatch with standard. Also ‘id’
> field is omitted, making it harder to use something like GROUP BY afterwards.

SQL/JSON standard uses "key" as a field name (9.39, page 716):
vi) For all h, 1 <= h <= q, let OBJh be an SQL/JSON object with three members:
1) The first member has key "key" and bound value Ki .
2) The second member has key "value”"and bound value BVi.
  3) The third member has key "id" and bound value IDj.

But "id" field was really missing in our implementation. I have added it
using byte offset of the object in jsonb as its identifier.

> 4. Looks like jsonpath executor lacks some CHECK_FOR_INTERRUPTS() during hot paths. Backend with
> following query is unresponsive to signals:
>
> select count(*) from (
> select '[0,1,2,3,4,5,6,7,8,9]'::json @* longpath::jsonpath from (
> select '$[' || string_agg(subscripts, ',') ||']' as longpath from (
> select 'last,1' as subscripts from generate_series(1,1000000)
> ) subscripts_q
> ) jpath_q
> ) count_q;

Fixed: added CHECK_FOR_INTERRUPTS() to jsonpath parser and executor.

> 5. Files generated by lex/bison should be listed in .gitignore files in corresponding directories.

Fixed.

> 6. My compiler complains about unused functions: JsonValueListConcat, JsonValueListClear.

Fixed: these functions used only in the next SQL/JSON patches were removed.

> 7. Presented patch files structure is somewhat complicated with patches to patches. I've melded them
> down to following patches:
>
> 0001: three first patches with preliminary datetime infrastructure
> 0002: Jsonpath engine and operators that is your previous 4+6+7
> 0003: Jsonpath extensions is your previous 8+9
> 0004: GIN support is your 5th path
>
> Also this patches were formed with 'git format-patch', so one can apply all of them with 'git apply'
> restoring each one as commit.

New patch set is formed with 'git format-patch', but I can still supply patches
in the previous split form.

> 8. Also sending jsonpath_check.sql with queries which I used to check compliance with the standard.
> That can be added to regression test, if you want.

I think these additional tests can be added, but I have not yet done it in this
version of the patches.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachment Content-Type Size
0001-Preliminary-datetime-infrastructure-v18.patch text/x-patch 15.3 KB
0002-Jsonpath-engine-and-operators-v18.patch text/x-patch 343.5 KB
0003-Jsonpath-syntax-extensions-v18.patch text/x-patch 46.2 KB
0004-Jsonpath-GIN-support-v18.patch text/x-patch 42.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2018-09-08 00:10:52 Re: *_to_xml() should copy SPI_processed/SPI_tuptable
Previous Message Tom Lane 2018-09-07 23:05:39 Re: unaccent(text) fails depending on search_path (WAS: pg_upgrade fails saying function unaccent(text) doesn't exist)