Re: SQL:2023 JSON simplified accessor support

From: Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>
To: Alexandra Wang <alexandra(dot)wang(dot)oss(at)gmail(dot)com>, Nikita Glukhov <glukhov(dot)n(dot)a(at)gmail(dot)com>
Cc: jian he <jian(dot)universality(at)gmail(dot)com>, Nikita Malakhov <hukutoc(at)gmail(dot)com>, Vik Fearing <vik(at)postgresfriends(dot)org>, Mark Dilger <mark(dot)dilger(at)enterprisedb(dot)com>, Matheus Alcantara <matheusssilv97(at)gmail(dot)com>, Peter Eisentraut <peter(at)eisentraut(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>, "David E(dot) Wheeler" <david(at)justatheory(dot)com>
Subject: Re: SQL:2023 JSON simplified accessor support
Date: 2025-09-02 03:32:33
Message-ID: A110B181-A8BA-4271-B9CE-D40CD396949F@gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> On Sep 2, 2025, at 10:59, Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com> wrote:
>
> jsonbsubs doesn’t need to check is_slice flag as well, I will explain that in my next email tougher with my new comments.

Now jsonb_check_json_needed() loops over all indirections:

static bool
jsonb_check_jsonpath_needed(List *indirection)
{
ListCell *lc;

foreach(lc, indirection)
{
Node *accessor = lfirst(lc);

if (IsA(accessor, String))
return true;
else
{
Assert(IsA(accessor, A_Indices));

if (castNode(A_Indices, accessor)->is_slice)
return true;
}
}

return false;
}

Let’s consider this statement: select data[‘a’][‘b’].c from jsonb_table.

For this indirection list, first two elements are non-slice indices, and the third is a string accessor. So json_check_jsonpath_needed() will return true.

Then:

static void
jsonb_subscript_transform(SubscriptingRef *sbsref,
List **indirection,
ParseState *pstate,
bool isSlice,
bool isAssignment)
{
List *upperIndexpr = NIL;
ListCell *idx;

/* Determine the result type of the subscripting operation; always jsonb */
sbsref->refrestype = JSONBOID;
sbsref->reftypmod = -1;

if (isSlice || jsonb_check_jsonpath_needed(*indirection))
{
jsonb_subscript_make_jsonpath(pstate, indirection, sbsref);
if (sbsref->refjsonbpath)
return;
}

It will fall into the call to json_subscript_make_jsonpath(), and it cannot process [‘a’], so sbsref->refjsonbpath will be NULL, then it will fall through down to the logic of processing [‘a’].

That’s actually a waste of looping over the entire indirection list and making an unnecessary call to jsonb_subscript_make_jsonpath().

In json_check_jsonpath_needed(), we can only check the first item. Also, as jsonb_check_jsonpath_needed() already has the logic of checking is_slice, here in jsonb_subscript_transform(), we don’t need to check “isSlice” in the “if” condition at all.

I made the change locally, and “make check” passed.

My dirty change is like:
```
@@ -118,11 +118,11 @@ coerce_jsonpath_subscript_to_int4_or_text(ParseState *pstate, Node *subExpr)
static bool
jsonb_check_jsonpath_needed(List *indirection)
{
- ListCell *lc;
+ //ListCell *lc;

- foreach(lc, indirection)
- {
- Node *accessor = lfirst(lc);
+ //foreach(lc, indirection)
+ //{
+ Node *accessor = lfirst(list_head(indirection));

if (IsA(accessor, String) || IsA(accessor, A_Star))
return true;
@@ -133,7 +133,8 @@ jsonb_check_jsonpath_needed(List *indirection)
if (castNode(A_Indices, accessor)->is_slice)
return true;
}
- }
+ //break;
+ //}

return false;
}

@@ -401,7 +435,7 @@ jsonb_subscript_transform(SubscriptingRef *sbsref,
sbsref->refrestype = JSONBOID;
sbsref->reftypmod = -1;

- if (isSlice || jsonb_check_jsonpath_needed(*indirection))
+ if (jsonb_check_jsonpath_needed(*indirection))
{
jsonb_subscript_make_jsonpath(pstate, indirection, sbsref);
if (sbsref->refjsonbpath)
```

The other comment is that jsonb_subscript_make_jsonpath() can be optimized a little bit. When indices and dot notations are mixed, it will always hit the clause “if (api_from == NULL)”. In this case, memory of jpi has been allocated, and with the “if” we need to free the memory. So, we can pull up calculating jpi_from, if it is NULL, then we don’t need to allocate and free memory. My dirty change is like:

```
else if (IsA(accessor, A_Indices))
{
+ JsonPathParseItem *jpi_from = NULL;
A_Indices *ai = castNode(A_Indices, accessor);

+ if (!ai->is_slice)
+ {
+ Assert(ai->uidx && !ai->lidx);
+ jpi_from = make_jsonpath_item_expr(pstate, ai->uidx, &sbsref->refupperindexpr, true);
+ if (jpi_from == NULL)
+ {
+ /*
+ * Break out of the loop if the subscript is not a
+ * non-null integer constant, so that we can fall back to
+ * jsonb subscripting logic.
+ *
+ * This is needed to handle cases with mixed usage of SQL
+ * standard json simplified accessor syntax and PostgreSQL
+ * jsonb subscripting syntax, e.g:
+ *
+ * select (jb).a['b'].c from jsonb_table;
+ *
+ * where dot-notation (.a and .c) is the SQL standard json
+ * simplified accessor syntax, and the ['b'] subscript is
+ * the PostgreSQL jsonb subscripting syntax, because 'b'
+ * is not a non-null constant integer and cannot be used
+ * for json array access.
+ *
+ * In this case, we cannot create a JsonPath item, so we
+ * break out of the loop and let
+ * jsonb_subscript_transform() handle this indirection as
+ * a PostgreSQL jsonb subscript.
+ */
+ break;
+ }
+ }
+
jpi = make_jsonpath_item(jpiIndexArray);
jpi->value.array.nelems = 1;
jpi->value.array.elems = palloc(sizeof(jpi->value.array.elems[0]));
@@ -303,12 +337,12 @@ jsonb_subscript_make_jsonpath(ParseState *pstate, List **indirection, Subscripti
}
else
{
- JsonPathParseItem *jpi_from = NULL;
+ //JsonPathParseItem *jpi_from = NULL;

- Assert(ai->uidx && !ai->lidx);
- jpi_from = make_jsonpath_item_expr(pstate, ai->uidx, &sbsref->refupperindexpr, true);
- if (jpi_from == NULL)
- {
+ //Assert(ai->uidx && !ai->lidx);
+ //jpi_from = make_jsonpath_item_expr(pstate, ai->uidx, &sbsref->refupperindexpr, true);
+ //if (jpi_from == NULL)
+ //{
/*
* Break out of the loop if the subscript is not a
* non-null integer constant, so that we can fall back to
@@ -331,10 +365,10 @@ jsonb_subscript_make_jsonpath(ParseState *pstate, List **indirection, Subscripti
* jsonb_subscript_transform() handle this indirection as
* a PostgreSQL jsonb subscript.
*/
- pfree(jpi->value.array.elems);
- pfree(jpi);
- break;
- }
+ // pfree(jpi->value.array.elems);
+ // pfree(jpi);
+ // break;
+ //}

jpi->value.array.elems[0].from = jpi_from;
jpi->value.array.elems[0].to = NULL;
```

With this change, “make check” also passed.

The third comment is about test cases. I only see tests for indices following dot notation, like data.a[‘b’], but not the reverse. Let’s add a few test cases for dot notation following indices, like data[‘a’].b.

The last comment is about error message:

```
evantest=# select data.a from test_jsonb_types;
ERROR: missing FROM-clause entry for table "data"
LINE 1: select data.a from test_jsonb_types;
```

“Missing FROM-clause entry” is quite confusing and not providing much useful hint to resolve the problem. When I first saw this error, I was stuck. Only after read through the commit comments, I figured out the problem. For end users, we should provide some more meaningful error messages.

Best regards,
--
Chao Li (Evan)
HighGo Software Co., Ltd.
https://www.highgo.com/

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message David G. Johnston 2025-09-02 03:53:34 Re: SQL:2023 JSON simplified accessor support
Previous Message Chao Li 2025-09-02 02:59:39 Re: SQL:2023 JSON simplified accessor support