Re: Define jsonpath functions as stable

From: "Jonathan S(dot) Katz" <jkatz(at)postgresql(dot)org>
To: Chapman Flack <chap(at)anastigmatix(dot)net>, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Define jsonpath functions as stable
Date: 2019-09-16 14:55:17
Message-ID: e24d18bf-999d-988f-615e-313ac397d904@postgresql.org
Lists: pgsql-hackers

Hi,

On 7/29/19 8:33 PM, Chapman Flack wrote:
> On 07/29/19 18:27, Alexander Korotkov wrote:
>
>> What do you think about renaming existing operator from like_regex to
>> pg_like_regex? Or introducing special flag indicating that PostgreSQL
>> regex engine is used ('p' for instance)?
>
> Renaming the operator is simple and certainly solves the problem.
>
> I don't have a strong technical argument for or against any of:
>
>
> $.** ? (@ pg_like_regex "O(w|v)" flag "i")
> $.** ? (@ pg_like_regex "O(w|v)")
>
>
> $.** ? (@ like_regex "O(w|v)" pg flag "i")
> $.** ? (@ like_regex "O(w|v)" pg)
>
>
> $.** ? (@ like_regex "O(w|v)" flag "ip")
> $.** ? (@ like_regex "O(w|v)" flag "p")
>
>
> It seems more of an aesthetic judgment (on which I am no particular
> authority).
>
> I think I would be -0.3 on the third approach just because of the need
> to still spell out ' flag "p"' when there is no other flag you want.
>
> I assume the first two approaches would be about equally easy to
> implement, assuming there's a parser that already has an optional
> production for "flag" STRING.
>
> Both of the first two seem pretty safe from colliding with a
> future addition to the standard.
>
> To my aesthetic sense, pg_like_regex feels like "another operator
> to remember" while like_regex ... pg feels like "ok, a slight variant
> on the operator from the spec".
>
> Later on, if a conformant version is added, the grammar might be a bit
> simpler with just one name and an optional pg.
>
> Going with a flag, there is some question of the likelihood of
> the chosen flag letter being usurped by the standard at some point.
>
> I'm leaning toward a flag for now in my own effort to provide the five SQL
> functions (like_regex, occurrences_regex, position_regex, substring_regex,
> and translate_regex), as for the time being it will be as an extension,
> so no custom grammar for me, and I don't really want to make five
> pg_* variant function names, and have that expand to ten function names
> someday if the real ones are implemented. (Hmm, I suppose I could add
> an optional function argument, distinct from flags; that would be
> analogous to adding a pg in the grammar ... avoids overloading the flags,
> avoids renaming the functions.)

Looking at this thread and [1], and the current state of open items [2], a
few thoughts:

It sounds like the easiest path to completion, without potentially adding
future headaches or pushing back the release too far, would be something
like these examples:

$.** ? (@ like_regex "O(w|v)" pg flag "i")
$.** ? (@ like_regex "O(w|v)" pg)

If it's using POSIX regexps, I would +1 using "posix" instead of "pg".

That said, from a user standpoint, it's slightly annoying to have to
include that keyword every time, and it could potentially mean changing /
testing quite a bit of code once we do support XQuery regexps. Based on
how we currently handle regular expressions, we've already conditioned
users to expect a certain behavior, and it would be inconsistent if we
do one thing in one place and another thing here, so I would like for
us to be cognizant of that.

Reading the XQuery spec that Chapman provided[3], it sounds like there
are some challenges present if we were to try to implement XQuery-based
regexps.

I do agree with Alvaro's comment ("We have an opportunity to do
better")[4], but I think we have to weigh the likelihood of actually
supporting the XQuery behaviors before we add more burden to our users.
Based on what needs to be done, it does not sound like that will happen
any time soon.

My first choice would be to leave it as is. We can make it abundantly
clear that, if we make changes in a future version, we will advise our
users on what actions to take and counsel them on any behavior changes.

My second choice is to have a flag that makes it clear what kind of
regexes are being used, in this case "posix". That is much clearer to
the user, but we would still default, at present, to using "posix"
expressions. If we ever do add the XQuery ones, we can debate whether we
default to the standard at that time, and if we do, we treat it like we
treat other deprecation issues and make abundantly clear what the
behavior is now.
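
To make the second option a bit more concrete, here is a minimal Python
sketch of how an optional engine keyword with a default might be read.
This is purely illustrative and is not how the jsonpath parser is
written: the grammar "like_regex STRING [pg | posix] [flag STRING]", the
LikeRegex tuple, and parse_like_regex() are all hypothetical names
invented for this example, and the "posix" default is an assumption
taken from the paragraph above.

```python
import re
from typing import NamedTuple


class LikeRegex(NamedTuple):
    pattern: str
    engine: str   # which regex dialect the pattern is written in
    flags: str    # flag letters such as "i"; empty when none are given


# Hypothetical grammar: like_regex STRING [pg | posix] [flag STRING]
_PREDICATE = re.compile(
    r'''like_regex\s+"(?P<pattern>[^"]*)"      # the pattern literal
        (?:\s+(?P<engine>pg|posix))?           # optional engine keyword
        (?:\s+flag\s+"(?P<flags>[^"]*)")?      # optional flag clause
        \s*$''',
    re.VERBOSE,
)


def parse_like_regex(text: str, default_engine: str = "posix") -> LikeRegex:
    """Parse one predicate; fall back to default_engine if no keyword is given."""
    match = _PREDICATE.match(text.strip())
    if match is None:
        raise ValueError(f"not a like_regex predicate: {text!r}")
    return LikeRegex(
        pattern=match.group("pattern"),
        engine=match.group("engine") or default_engine,
        flags=match.group("flags") or "",
    )
```

With this shape, a predicate that omits the keyword keeps working, and
only default_engine would need to change if the default were ever
switched to XQuery regexps.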

Thanks,

Jonathan

[1] https://www.postgresql.org/message-id/flat/5CF28EA0.80902%40anastigmatix.net
[2] https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items
[3] https://wiki.postgresql.org/wiki/PostgreSQL_vs_SQL/XML_Standards#XML_Query_regular_expressions
[4] https://www.postgresql.org/message-id/20190618154907.GA6049%40alvherre.pgsql
