From: | Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org> |
---|---|
To: | Amit Langote <amitlangote09(at)gmail(dot)com> |
Cc: | jian he <jian(dot)universality(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Erik Rijkers <er(at)xs4all(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: remaining sql/json patches |
Date: | 2023-12-06 10:42:29 |
Message-ID: | 202312061042.uxymt2swy5fz@alvherre.pgsql |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On 2023-Dec-05, Amit Langote wrote:
> I've attempted to trim down the JSON_TABLE grammar (0004), but this is
> all I've managed so far. Among other things, I couldn't refactor the
> grammar to do away with the following:
>
> +%nonassoc NESTED
> +%left PATH
To recap, the reason we're arguing about this is that this creates two
new precedence classes, which are higher than everything else. Judging
by the discussios in thread [1], this is not acceptable. Without either
those new classes or the two hacks I describe below, the grammar has the
following shift/reduce conflict:
State 6220
2331 json_table_column_definition: NESTED . path_opt Sconst COLUMNS '(' json_table_column_definition_list ')'
2332 | NESTED . path_opt Sconst AS name COLUMNS '(' json_table_column_definition_list ')'
2636 unreserved_keyword: NESTED .
PATH shift, and go to state 6286
SCONST reduce using rule 2336 (path_opt)
PATH [reduce using rule 2636 (unreserved_keyword)]
$default reduce using rule 2636 (unreserved_keyword)
path_opt go to state 6287
First, while the grammar uses "NESTED path_opt" in the relevant productions, I
noticed that there's no test that uses NESTED without PATH, so if we break that
case, we won't notice. I propose we remove the PATH keyword from one of
the tests in jsonb_sqljson.sql in order to make sure the grammar
continues to work after whatever hacking we do:
diff --git a/src/test/regress/expected/jsonb_sqljson.out b/src/test/regress/expected/jsonb_sqljson.out
index 7e8ae6a696..8fd2385cdc 100644
--- a/src/test/regress/expected/jsonb_sqljson.out
+++ b/src/test/regress/expected/jsonb_sqljson.out
@@ -1548,7 +1548,7 @@ HINT: JSON_TABLE column names must be distinct from one another.
SELECT * FROM JSON_TABLE(
jsonb 'null', '$[*]' AS p0
COLUMNS (
- NESTED PATH '$' AS p1 COLUMNS (
+ NESTED '$' AS p1 COLUMNS (
NESTED PATH '$' AS p11 COLUMNS ( foo int ),
NESTED PATH '$' AS p12 COLUMNS ( bar int )
),
diff --git a/src/test/regress/sql/jsonb_sqljson.sql b/src/test/regress/sql/jsonb_sqljson.sql
index ea5db88b40..ea9b4ff8b6 100644
--- a/src/test/regress/sql/jsonb_sqljson.sql
+++ b/src/test/regress/sql/jsonb_sqljson.sql
@@ -617,7 +617,7 @@ SELECT * FROM JSON_TABLE(
SELECT * FROM JSON_TABLE(
jsonb 'null', '$[*]' AS p0
COLUMNS (
- NESTED PATH '$' AS p1 COLUMNS (
+ NESTED '$' AS p1 COLUMNS (
NESTED PATH '$' AS p11 COLUMNS ( foo int ),
NESTED PATH '$' AS p12 COLUMNS ( bar int )
),
Having done that, AFAICS there are two possible fixes for the grammar.
One is to keep the idea of assigning precedence explicitly to these
keywords, but do something less hackish -- we can put NESTED together
with UNBOUNDED, and classify PATH in the IDENT group. This requires no
further changes. This would make NESTED PATH follow the same rationale
as UNBOUNDED FOLLOWING / UNBOUNDED PRECEDING. Here's is a preliminary
patch for that (the large comment above needs to be updated.)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c15fcf2eb2..1493ac7580 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -887,9 +887,9 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
* json_predicate_type_constraint and json_key_uniqueness_constraint_opt
* productions (see comments there).
*/
-%nonassoc UNBOUNDED /* ideally would have same precedence as IDENT */
+%nonassoc UNBOUNDED NESTED /* ideally would have same precedence as IDENT */
%nonassoc IDENT PARTITION RANGE ROWS GROUPS PRECEDING FOLLOWING CUBE ROLLUP
- SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT
+ SET KEYS OBJECT_P SCALAR VALUE_P WITH WITHOUT PATH
%left Op OPERATOR /* multi-character ops and user-defined operators */
%left '+' '-'
%left '*' '/' '%'
@@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
*/
%left JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
-%nonassoc NESTED
-%left PATH
%%
/*
The other thing we can do is use the two-token lookahead trick, by
declaring
%token NESTED_LA
and using the parser.c code to replace NESTED with NESTED_LA when it is
followed by PATH. This doesn't require assigning precedence to
anything. We do need to expand the two rules that have "NESTED
opt_path Sconst" to each be two rules, one for "NESTED_LA PATH Sconst"
and another for "NESTED Sconst". So the opt_path production goes away.
This preliminary patch does that. (I did not touch the ecpg grammar, but
it needs an update too.)
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index c15fcf2eb2..8e4c1d4ebe 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -817,7 +817,7 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
* FORMAT_LA, NULLS_LA, WITH_LA, and WITHOUT_LA are needed to make the grammar
* LALR(1).
*/
-%token FORMAT_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
+%token FORMAT_LA NESTED_LA NOT_LA NULLS_LA WITH_LA WITHOUT_LA
/*
* The grammar likewise thinks these tokens are keywords, but they are never
@@ -911,8 +911,6 @@ static Node *makeRecursiveViewSelect(char *relname, List *aliases, Node *query);
*/
%left JOIN CROSS LEFT FULL RIGHT INNER_P NATURAL
-%nonassoc NESTED
-%left PATH
%%
/*
@@ -16771,7 +16769,7 @@ json_table_column_definition:
n->location = @1;
$$ = (Node *) n;
}
- | NESTED path_opt Sconst
+ | NESTED_LA PATH Sconst
COLUMNS '(' json_table_column_definition_list ')'
{
JsonTableColumn *n = makeNode(JsonTableColumn);
@@ -16783,7 +16781,19 @@ json_table_column_definition:
n->location = @1;
$$ = (Node *) n;
}
- | NESTED path_opt Sconst AS name
+ | NESTED Sconst
+ COLUMNS '(' json_table_column_definition_list ')'
+ {
+ JsonTableColumn *n = makeNode(JsonTableColumn);
+
+ n->coltype = JTC_NESTED;
+ n->pathspec = $2;
+ n->pathname = NULL;
+ n->columns = $5;
+ n->location = @1;
+ $$ = (Node *) n;
+ }
+ | NESTED_LA PATH Sconst AS name
COLUMNS '(' json_table_column_definition_list ')'
{
JsonTableColumn *n = makeNode(JsonTableColumn);
@@ -16795,6 +16805,19 @@ json_table_column_definition:
n->location = @1;
$$ = (Node *) n;
}
+ | NESTED Sconst AS name
+ COLUMNS '(' json_table_column_definition_list ')'
+ {
+ JsonTableColumn *n = makeNode(JsonTableColumn);
+
+ n->coltype = JTC_NESTED;
+ n->pathspec = $2;
+ n->pathname = $4;
+ n->columns = $7;
+ n->location = @1;
+ $$ = (Node *) n;
+ }
+
;
json_table_column_path_specification_clause_opt:
@@ -16802,11 +16825,6 @@ json_table_column_path_specification_clause_opt:
| /* EMPTY */ { $$ = NULL; }
;
-path_opt:
- PATH { }
- | /* EMPTY */ { }
- ;
-
json_table_plan_clause_opt:
PLAN '(' json_table_plan ')' { $$ = $3; }
| PLAN DEFAULT '(' json_table_default_plan_choices ')'
diff --git a/src/backend/parser/parser.c b/src/backend/parser/parser.c
index e17c310cc1..e3092f2c3e 100644
--- a/src/backend/parser/parser.c
+++ b/src/backend/parser/parser.c
@@ -138,6 +138,7 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
switch (cur_token)
{
case FORMAT:
+ case NESTED:
cur_token_length = 6;
break;
case NOT:
@@ -204,6 +205,16 @@ base_yylex(YYSTYPE *lvalp, YYLTYPE *llocp, core_yyscan_t yyscanner)
}
break;
+ case NESTED:
+ /* Replace NESTED by NESTED_LA if it's followed by PATH */
+ switch (next_token)
+ {
+ case PATH:
+ cur_token = NESTED_LA;
+ break;
+ }
+ break;
+
case NOT:
/* Replace NOT by NOT_LA if it's followed by BETWEEN, IN, etc */
switch (next_token)
I don't know which of the two "fixes" is less bad. Like Amit, I was not
able to find a solution to the problem by merely attaching precedences
to rules. (I did not try to mess with the precedence of
unreserved_keyword, because I'm pretty sure that would not be a good
solution even if I could make it work.)
[1] https://postgr.es/m/CADT4RqBPdbsZW7HS1jJP319TMRHs1hzUiP=iRJYR6UqgHCrgNQ@mail.gmail.com
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
From | Date | Subject | |
---|---|---|---|
Next Message | jian he | 2023-12-06 10:47:10 | Re: POC PATCH: copy from ... exceptions to: (was Re: VLDB Features) |
Previous Message | Amit Kapila | 2023-12-06 10:19:29 | Re: logical decoding and replication of sequences, take 2 |