Re: pgsql: Implement jsonpath .datetime() method

From: Alexander Korotkov <a(dot)korotkov(at)postgrespro(dot)ru>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Nikita Glukhov <n(dot)gluhov(at)postgrespro(dot)ru>, pgsql-committers <pgsql-committers(at)lists(dot)postgresql(dot)org>, Alexander Korotkov <akorotkov(at)postgresql(dot)org>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pgsql: Implement jsonpath .datetime() method
Date: 2019-10-01 17:41:43
Message-ID: CAPpHfdvbxyk2_ODMomMWKaUg1EZdJbo3OrK2A-WqSoSwvvm4Wg@mail.gmail.com
Lists: pgsql-committers pgsql-hackers

On Mon, Sep 30, 2019 at 10:56 PM Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Sun, Sep 29, 2019 at 10:30 AM Alexander Korotkov
> <a(dot)korotkov(at)postgrespro(dot)ru> wrote:
> > So, jsonpath behaves as if 1000000 is not greater than 2020. This
> > looks plainly false, and a user can't be expected to anticipate it
> > unless she is familiar with our particular implementation issues. So
> > now my opinion is that such errors shouldn't be suppressed. We can't
> > suppress *every* error. If we tried to, we would end up suppressing
> > OOM errors and returning garbage, which is obviously ridiculous.
> > Opinions?
>
> I don't know enough about jsonpath to have a view on specifically
> which errors ought to be suppressed, but I agree that it's probably
> not all of them. In fact, I'd go so far as to say that thinking about
> it in terms of error suppression is probably not the right approach in
> the first place. Rather, you want to ask what behavior you're trying
> to create.
>
> For example, if I'm trying to write a function that takes a string as
> input and returns JSON, where the result is formatted as a number if
> possible or a string otherwise, I might want access at the C level to
> the guts of numeric_in, with all parsing errors returned rather than
> thrown. But it would be silly to suppress an out-of-memory condition,
> because that doesn't help the caller. The caller wants to know whether
> the thing can be parsed as a number or not, and that has nothing to do
> with whether we're out of memory, so an out-of-memory error should
> still be thrown.
>
> In this case here, it seems to me that you should similarly start by
> defining the behavior you're trying to create. Unless that's clearly
> defined, deciding which errors to suppress may be difficult.
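
For illustration, here is a minimal C sketch of the pattern Robert
describes. The names and the use of strtod() are mine, not an existing
PostgreSQL API: a parse failure is reported through the return value,
while an allocation failure is still a hard error, because it says
nothing about whether the input is a number.

#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns false only when 'str' does not parse as a number; never throws. */
static bool
looks_like_number(const char *str)
{
    char   *end;

    errno = 0;
    (void) strtod(str, &end);
    return end != str && *end == '\0' && errno != ERANGE;
}

/*
 * Formats 'str' as a JSON number if possible, as a JSON string otherwise.
 * String escaping is omitted for brevity.  Caller frees the result.
 */
static char *
to_json_scalar(const char *str)
{
    size_t  len = strlen(str);
    char   *out = malloc(len + 3);      /* room for two quotes and a NUL */

    if (out == NULL)
    {
        /* Out of memory is not a parsing question: fail hard, don't guess. */
        fprintf(stderr, "out of memory\n");
        abort();
    }

    if (looks_like_number(str))
        memcpy(out, str, len + 1);              /* emit as a bare number */
    else
        snprintf(out, len + 3, "\"%s\"", str);  /* emit as a quoted string */

    return out;
}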

Making C functions return errors rather than throw them is exactly what
we're implementing in our patchsets. In the big picture, the behavior
we're trying to create is that of the SQL:2016 standard, which defines
error handling as follows:

> The SQL operators JSON_VALUE, JSON_QUERY, JSON_TABLE, and JSON_EXISTS provide
> the following mechanisms to handle these errors:
> 1) The SQL/JSON path language traps any errors that occur during the evaluation
> of a <JSON filter expression>. Depending on the precise <JSON path predicate>
> contained in the <JSON filter expression>, the result may be Unknown, True, or
> False, depending on the outcome of non-error tests evaluated in the <JSON path
> predicate>.
> 2) The SQL/JSON path language has two modes, strict and lax, which govern
> structural errors, as follows:
> a) In lax mode:
> i) If an operation requires an SQL/JSON array but the operand is not an
> SQL/JSON array, then the operand is first “wrapped” in an SQL/JSON array
> prior to performing the operation.
> ii) If an operation requires something other than an SQL/JSON array, but
> the operand is an SQL/JSON array, then the operand is “unwrapped” by
> converting its elements into an SQL/JSON sequence prior to performing the
> operation.
> iii) After applying the preceding resolutions to structural errors, if
> there is still a structural error, the result is an empty SQL/JSON
> sequence.
> b) In strict mode, if the structural error occurs within a <JSON filter
> expression>, then the error handling of <JSON filter expression> applies.
> Otherwise, a structural error is an unhandled error.
> 3) Non-structural errors outside of a <JSON path predicate> are always
> unhandled errors, resulting in an exception condition returned from the path
> engine to the SQL/JSON query operator.
> 4) The SQL/JSON query operators provide an ON ERROR clause to specify the
> behavior in case of an input conversion error, an unhandled structural error,
> an unhandled non-structural error, or an output conversion error.

So, basically, the standard requires us to suppress any error that
happens in a filter expression. But, as I wrote before, suppressing
errors in datetime comparison may lead to surprising results. That
happens only in rare corner cases, but still. This leaves us with an
uneasy choice between consistent behavior and standard behavior.
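
To make the standard's rule 1 above concrete, here is a minimal sketch
of the trapping mechanism, with hypothetical names rather than the
actual jsonpath executor code: any error raised while evaluating a
predicate inside a filter expression is converted into the three-valued
Unknown instead of being propagated to the caller.

#include <stdbool.h>

typedef enum { JPB_FALSE, JPB_TRUE, JPB_UNKNOWN } JsonPathBool;

/* A predicate reports evaluation errors through the 'error' flag. */
typedef bool (*jp_predicate) (const void *left, const void *right, bool *error);

static JsonPathBool
execute_filter_predicate(jp_predicate pred, const void *left, const void *right)
{
    bool        error = false;
    bool        result = pred(left, right, &error);

    /*
     * Per the standard, an error inside a filter yields Unknown; it is not
     * propagated as an exception.
     */
    if (error)
        return JPB_UNKNOWN;

    return result ? JPB_TRUE : JPB_FALSE;
}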

However, Nikita Glukhov came up with a good idea here. Instead of
deciding whether to suppress cast errors in datetime comparison, we can
simply eliminate those errors. If we know that casting a date to
timestamp overflows the upper bound of the finite timestamp range, then
we also know that the date is greater than any finite timestamp. So we
are still able to perform a correct comparison. I'm going to implement
this and post a patch.
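
A minimal sketch of that idea, with stand-in types and limits rather
than the real PostgreSQL ones: if converting the date would overflow the
finite timestamp range, the ordering is already known, so we can return
it instead of raising (or suppressing) an error.

#include <stdint.h>

typedef int64_t pg_timestamp_t;     /* stand-in for Timestamp (microseconds) */
typedef int32_t pg_date_t;          /* stand-in for DateADT (days) */

#define TS_MIN          INT64_MIN   /* placeholders for the finite range */
#define TS_MAX          INT64_MAX
#define USECS_PER_DAY   INT64_C(86400000000)

/* Returns -1, 0, or +1; never errors out for out-of-range dates. */
static int
cmp_date_timestamp(pg_date_t date, pg_timestamp_t ts)
{
    pg_timestamp_t  date_ts;

    /* Conversion would overflow: the ordering is already known. */
    if (date > TS_MAX / USECS_PER_DAY)
        return 1;                   /* later than any finite timestamp */
    if (date < TS_MIN / USECS_PER_DAY)
        return -1;                  /* earlier than any finite timestamp */

    date_ts = (pg_timestamp_t) date * USECS_PER_DAY;

    return (date_ts < ts) ? -1 : (date_ts > ts) ? 1 : 0;
}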

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
