Re: Q: Escapes in jsonpath Idents

From: Erik Wienhold <ewie(at)ewie(dot)name>
To: "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Cc: Peter Eisentraut <peter(at)eisentraut(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Q: Escapes in jsonpath Idents
Date: 2024-04-24 19:22:03
Message-ID: kayzzqmh2com3uokglxvpnsikykqi64c727tdhmigw3j7uotxd@mbvdxkudjz5o
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2024-04-24 13:52 +0200, David E. Wheeler wrote:
> On Apr 24, 2024, at 05:51, Peter Eisentraut <peter(at)eisentraut(dot)org> wrote:
>
> > A <JSON path identifier> is classified as follows.
> >
> > Case:
> >
> > a) A <JSON path identifier> that is a <dollar sign> is a <JSON
> > path context variable>.
> >
> > b) A <JSON path identifier> that begins with <dollar sign> is a
> > <JSON path named variable>.
> >
> > c) Otherwise, a <JSON path identifier> is a <JSON path key name>.
> >
> > Does this help? I wasn't following all the discussion to see if
> > there is anything wrong with the implementation.

Thanks Peter! But what is the definition of the entire path expression?
Perhaps something like:

<JSON path> ::= <JSON path identifier> { "." <JSON path identifier> }

That would imply that "$.$foo" is a valid path that accesses a variable
member (but I guess the path evaluation is also specified somewhere).

Does it say anything about double-quoted accessors? In table 8.25[1] we
allow member accessor ."$varname" and it says "If the key name matches
some named variable starting with $ or does not meet the JavaScript
rules for an identifier, it must be enclosed in double quotes to make it
a string literal."

What bugs me about this description, after reading it a couple of times,
is that it's not clear what is meant by ."$varname". It could mean two
things: (1) the double-quoting masks $varname in order to not interpret
those characters as a variable or (2) an interpolated string that
resolves $varname and yields a dynamic member accessor.

The current implementation supports (1), i.e., ."$foo" does not refer to
variable foo but the actual property "$foo":

=> select jsonb_path_query('{"$foo":123,"bar":456}', '$."$foo"', '{"foo":"bar"}');
jsonb_path_query
------------------
123
(1 row)

Under case (2) I'd expect that query to return 456 (because $foo
resolves to "bar"). (Similar to how psql would resolve :'foo' to
'bar'.)

Variables already work in array accessors and table 8.25 says that "The
specified index can be an integer, as well as an expression returning a
single numeric value [...]". A variable is such an expression.

=> select jsonb_path_query('[2,3,5]', '$[$i]', '{"i":1}');
jsonb_path_query
------------------
3
(1 row)

So I'd expect a similar behavior for member accessors as well when
seeing ."$varname" in the same table.

> Yes, it does, as it ties the special meaning of the dollar sign to the
> *beginning* of an expression. So it makes sense that this would be an
> error:
>
> david=# select '$.$foo'::jsonpath;
> ERROR: syntax error at or near "$foo" of jsonpath input
> LINE 1: select '$.$foo'::jsonpath;
> ^
> But I’m less sure when a dollar sign is used in the *middle* (or end)
> of a json path identifier:
>
> david=# select '$.xx$foo'::jsonpath;
> ERROR: syntax error at or near "$foo" of jsonpath input
> LINE 1: select '$.xx$foo'::jsonpath;
> ^
> Perhaps that should be valid?

Yes, I think so. That would be case C in the spec excerpt provided by
Peter. So it's just a key name that happens to contain (but not start
with) the dollar sign.

[1] https://www.postgresql.org/docs/current/datatype-json.html#TYPE-JSONPATH-ACCESSORS

--
Erik

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bruce Momjian 2024-04-24 19:31:49 Re: Statistics Import and Export
Previous Message Daniel Gustafsson 2024-04-24 19:19:57 Re: some additional (small) problems with pg_combinebackup and tablespaces