Re: SQL/JSON revisited

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, e(dot)indrupskaya(at)postgrespro(dot)ru
Subject: Re: SQL/JSON revisited
Date: 2023-02-28 11:36:33
Message-ID: CA+HiwqF3_MRaPUOMaH2NJowxSnXu2=5vexrUTQ83EgWspy95LQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Feb 27, 2023 at 4:45 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
> On Tue, Feb 21, 2023 at 2:25 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> > Uh, so this continues to do recursive expression evaluation, as
> > ExecEvalJsonExpr()->JsonPathQuery()->executeJsonPath(EvalJsonPathVar)
> >
> > I'm getting grumpy here. This is wrong, has been pointed out many times. The
> > only thing that changes is that the point of recursion is moved around.
>
> Actually, these JSON path vars, along with other sub-expressions of
> JsonExpr, *are* computed non-recursively as ExprEvalSteps of the
> JsonExprState, at least in the cases where the vars are to be computed
> as part of evaluating the JsonExpr itself. So, the code path you've
> shown above perhaps as a hypothetical doesn't really exist, though
> there *is* an instance where these path vars are computed *outside*
> the context of evaluating the parent JsonExpr, such as in
> JsonTableResetContextItem(). Maybe there's a cleaner way of doing
> that though...
>
> > > + * JsonTableInitOpaque
> > > + * Fill in TableFuncScanState->opaque for JsonTable processor
> > > + */
> > > +static void
> > > +JsonTableInitOpaque(TableFuncScanState *state, int natts)
> > > +{
> > > + JsonTableContext *cxt;
> > > + PlanState *ps = &state->ss.ps;
> > > + TableFuncScan *tfs = castNode(TableFuncScan, ps->plan);
> > > + TableFunc *tf = tfs->tablefunc;
> > > + JsonExpr *ci = castNode(JsonExpr, tf->docexpr);
> > > + JsonTableParent *root = castNode(JsonTableParent, tf->plan);
> > > + List *args = NIL;
> > > + ListCell *lc;
> > > + int i;
> > > +
> > > + cxt = palloc0(sizeof(JsonTableContext));
> > > + cxt->magic = JSON_TABLE_CONTEXT_MAGIC;
> > > +
> > > + if (ci->passing_values)
> > > + {
> > > + ListCell *exprlc;
> > > + ListCell *namelc;
> > > +
> > > + forboth(exprlc, ci->passing_values,
> > > + namelc, ci->passing_names)
> > > + {
> > > + Expr *expr = (Expr *) lfirst(exprlc);
> > > + String *name = lfirst_node(String, namelc);
> > > + JsonPathVariableEvalContext *var = palloc(sizeof(*var));
> > > +
> > > + var->name = pstrdup(name->sval);
> > > + var->typid = exprType((Node *) expr);
> > > + var->typmod = exprTypmod((Node *) expr);
> > > + var->estate = ExecInitExpr(expr, ps);
> > > + var->econtext = ps->ps_ExprContext;
> > > + var->mcxt = CurrentMemoryContext;
> > > + var->evaluated = false;
> > > + var->value = (Datum) 0;
> > > + var->isnull = true;
> > > +
> > > + args = lappend(args, var);
> > > + }
> > > + }
> > > +
> > > + cxt->colexprs = palloc(sizeof(*cxt->colexprs) *
> > > + list_length(tf->colvalexprs));
> > > +
> > > + JsonTableInitScanState(cxt, &cxt->root, root, NULL, args,
> > > + CurrentMemoryContext);
> > > +
> > > + i = 0;
> > > +
> > > + foreach(lc, tf->colvalexprs)
> > > + {
> > > + Expr *expr = lfirst(lc);
> > > +
> > > + cxt->colexprs[i].expr =
> > > + ExecInitExprWithCaseValue(expr, ps,
> > > + &cxt->colexprs[i].scan->current,
> > > + &cxt->colexprs[i].scan->currentIsNull);
> > > +
> > > + i++;
> > > + }
> > > +
> > > + state->opaque = cxt;
> > > +}
> >
> > Why don't you just emit the proper expression directly, insted of the
> > CaseTestExpr stuff, that you then separately evaluate?
>
> I suppose you mean emitting the expression that supplies the value
> given by scan->current and scan->currentIsNull into the same ExprState
> that holds the steps for a given colvalexpr. If so, I don't really
> see a way of doing that given the current model of JSON_TABLE
> execution. The former is computed as part of
> TableFuncRoutine.FetchRow(scan), which sets scan.current (and
> currentIsNull) and the letter is computer as part of
> TableFuncRoutine.GetValue(scan, colnum).

I looked around for another way to pass the value of evaluating one
expression (JsonTableParent.path) as input to the evaluation of
another (an expression in TableFunc.colvalexprs). The only thing that
came to mind is to use PARAM_EXEC parameters instead of CaseTestExpr
placeholders, though I'm not sure whether that is simpler or whether
that would really make things better?

> > Evaluating N expressions for a json table isn't a good approach, both memory
> > and CPU efficiency wise.
>
> Are you referring to JsonTableInitOpaque() initializing all these
> sub-expressions of JsonTableParent, especially colvalexprs, using N
> *independent* ExprStates? That could perhaps be made to work by
> making JsonTableParent be an expression recognized by execExpr.c, so
> that a single ExprState can store the steps for all its
> sub-expressions, much like JsonExpr is. I'll give that a try, though
> I wonder if the semantics of making this work in a single
> ExecEvalExpr() call will mismatch that of the current way, because
> different sub-expressions are currently evaluated under different APIs
> of TableFuncRoutine.

Hmm, the idea to turn JSON_TABLE into a single expression turned out
to be a non-starter after all, because, unlike JsonExpr, it can
produce multiple values. So there must be an ExprState for computing
each column of its output rows. As I mentioned in my other reply,
TableFuncScanState has a list of ExprStates anyway for
TableFunc.colexprs. What we could do is move the ExprStates of
TableFunc.colvalexprs into TableFuncScanState instead of making that
part of the JSON_TABLE opaque state, as I've done in the attached
updated patch.

I also found a way to not require ExecInitExprWithCaseValue() for the
initialization of those expressions by moving the responsibility of
passing the value of CaseTestExpr placeholder contained in those
expressions to the time of evaluating the expressions rather than
initialization time.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v7-0005-RETURNING-clause-for-JSON-and-JSON_SCALAR.patch application/octet-stream 10.2 KB
v7-0006-JSON_TABLE.patch application/octet-stream 100.6 KB
v7-0009-Claim-SQL-standard-compliance-for-SQL-JSON-featur.patch application/octet-stream 2.8 KB
v7-0007-PLAN-clauses-for-JSON_TABLE.patch application/octet-stream 70.9 KB
v7-0008-Documentation-for-SQL-JSON-features.patch application/octet-stream 48.2 KB
v7-0004-SQL-JSON-functions.patch application/octet-stream 51.9 KB
v7-0003-SQL-JSON-query-functions.patch application/octet-stream 195.5 KB
v7-0001-SQL-JSON-constructors.patch application/octet-stream 157.4 KB
v7-0002-IS-JSON-predicate.patch application/octet-stream 42.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrew Dunstan 2023-02-28 11:55:17 Re: WIN32 pg_import_system_collations
Previous Message vignesh C 2023-02-28 10:59:22 Re: Allow logical replication to copy tables in binary format