Re: remaining sql/json patches

From: Amit Langote <amitlangote09(at)gmail(dot)com>
To: Andres Freund <andres(at)anarazel(dot)de>
Cc: Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Andrew Dunstan <andrew(at)dunslane(dot)net>, jian he <jian(dot)universality(at)gmail(dot)com>, Erik Rijkers <er(at)xs4all(dot)nl>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: remaining sql/json patches
Date: 2023-11-15 13:00:41
Message-ID: CA+HiwqG_XFksg7HN_gbmcOQU00Q5bvTKeLSZ3NftewdBRw4fKA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Sorry for the late reply.

On Sat, Oct 7, 2023 at 6:49 AM Andres Freund <andres(at)anarazel(dot)de> wrote:
> On 2023-09-29 13:57:46 +0900, Amit Langote wrote:
> > +/*
> > + * Push steps to evaluate a JsonExpr and its various subsidiary expressions.
> > + */
> > +static void
> > +ExecInitJsonExpr(JsonExpr *jexpr, ExprState *state,
> > + Datum *resv, bool *resnull,
> > + ExprEvalStep *scratch)
> > +{
> > + JsonExprState *jsestate = palloc0(sizeof(JsonExprState));
> > + JsonExprPreEvalState *pre_eval = &jsestate->pre_eval;
> > + ListCell *argexprlc;
> > + ListCell *argnamelc;
> > + int skip_step_off = -1;
> > + int passing_args_step_off = -1;
> > + int coercion_step_off = -1;
> > + int coercion_finish_step_off = -1;
> > + int behavior_step_off = -1;
> > + int onempty_expr_step_off = -1;
> > + int onempty_jump_step_off = -1;
> > + int onerror_expr_step_off = -1;
> > + int onerror_jump_step_off = -1;
> > + int result_coercion_jump_step_off = -1;
> > + List *adjust_jumps = NIL;
> > + ListCell *lc;
> > + ExprEvalStep *as;
> > +
> > + jsestate->jsexpr = jexpr;
> > +
> > + /*
> > + * Add steps to compute formatted_expr, pathspec, and PASSING arg
> > + * expressions as things that must be evaluated *before* the actual JSON
> > + * path expression.
> > + */
> > + ExecInitExprRec((Expr *) jexpr->formatted_expr, state,
> > + &pre_eval->formatted_expr.value,
> > + &pre_eval->formatted_expr.isnull);
> > + ExecInitExprRec((Expr *) jexpr->path_spec, state,
> > + &pre_eval->pathspec.value,
> > + &pre_eval->pathspec.isnull);
> > +
> > + /*
> > + * Before pushing steps for PASSING args, push a step to decide whether to
> > + * skip evaluating the args and the JSON path expression depending on
> > + * whether either of formatted_expr and pathspec is NULL; see
> > + * ExecEvalJsonExprSkip().
> > + */
> > + scratch->opcode = EEOP_JSONEXPR_SKIP;
> > + scratch->d.jsonexpr_skip.jsestate = jsestate;
> > + skip_step_off = state->steps_len;
> > + ExprEvalPushStep(state, scratch);
>
> Could SKIP be implemented using EEOP_JUMP_IF_NULL with a bit of work? I see
> that it sets jsestate->post_eval.jcstate, but I don't understand why it needs
> to be done that way. /* ExecEvalJsonExprCoercion() depends on this. */ doesn't
> explain that much.

OK, I've managed to make this work using EEOP_JUMP_IF_NULL for each of
the two expressions that need checking: formatted_expr and pathspec.

> > + /* PASSING args. */
> > + jsestate->pre_eval.args = NIL;
> > + passing_args_step_off = state->steps_len;
> > + forboth(argexprlc, jexpr->passing_values,
> > + argnamelc, jexpr->passing_names)
> > + {
> > + Expr *argexpr = (Expr *) lfirst(argexprlc);
> > + String *argname = lfirst_node(String, argnamelc);
> > + JsonPathVariable *var = palloc(sizeof(*var));
> > +
> > + var->name = pstrdup(argname->sval);
>
> Why does this need to be strdup'd?

Seems unnecessary, so removed.

> > + /* Step for the actual JSON path evaluation; see ExecEvalJsonExpr(). */
> > + scratch->opcode = EEOP_JSONEXPR_PATH;
> > + scratch->d.jsonexpr.jsestate = jsestate;
> > + ExprEvalPushStep(state, scratch);
> > +
> > + /*
> > + * Step to handle ON ERROR and ON EMPTY behavior. Also, to handle errors
> > + * that may occur during coercion handling.
> > + *
> > + * See ExecEvalJsonExprBehavior().
> > + */
> > + scratch->opcode = EEOP_JSONEXPR_BEHAVIOR;
> > + scratch->d.jsonexpr_behavior.jsestate = jsestate;
> > + behavior_step_off = state->steps_len;
> > + ExprEvalPushStep(state, scratch);
>
> From what I can tell there a) can never be a step between EEOP_JSONEXPR_PATH
> and EEOP_JSONEXPR_BEHAVIOR b) EEOP_JSONEXPR_PATH ends with an unconditional
> branch. What's the point of the two different steps here?

A separate BEHAVIOR step is needed to jump to when the coercion step
catches an error which must be handled with the appropriate ON ERROR
behavior.

> > + EEO_CASE(EEOP_JSONEXPR_PATH)
> > + {
> > + /* too complex for an inline implementation */
> > + ExecEvalJsonExpr(state, op, econtext);
> > + EEO_NEXT();
> > + }
>
> Why does EEOP_JSONEXPR_PATH call ExecEvalJsonExpr, the names don't match...

Renamed to ExecEvalJsonExprPath().

> > + EEO_CASE(EEOP_JSONEXPR_SKIP)
> > + {
> > + /* too complex for an inline implementation */
> > + EEO_JUMP(ExecEvalJsonExprSkip(state, op));
> > + }
> ...
>
>
> > + EEO_CASE(EEOP_JSONEXPR_COERCION_FINISH)
> > + {
> > + /* too complex for an inline implementation */
> > + EEO_JUMP(ExecEvalJsonExprCoercionFinish(state, op));
> > + }
>
> This seems to just return op->d.jsonexpr_coercion_finish.jump_coercion_error
> or op->d.jsonexpr_coercion_finish.jump_coercion_done. Which makes me think
> it'd be better to return a boolean? Particularly because that's how you
> already implemented it for JIT (except that you did it by hardcoding the jump
> step to compare to, which seems odd).
>
> Separately, why do we even need a jump for both cases, and not just for the
> error case?

Agreed. I've redesigned all of the steps so that we need to remember
only a couple of jump addresses in JsonExprState for hard-coded
jumping:

1. the address of the step that handles ON ERROR/EMPTY clause
(statically set during compilation)
2. the address of the step that evaluates coercion (dynamically set
depending on the type of the JSON value to coerce)

The redesign involved changing:

* What each step does
* Arranging steps in the order of operations that must be performed in
the following order:

1. compute formatted_expr
2. JUMP_IF_NULL (jumps to coerce the NULL result)
3. compute pathspec
4. JUMP_IF_NULL (jumps to coerce the NULL result)
5. compute PASSING arg expressions or noop
6. compute JsonPath{Exists|Query|Value} (hard-coded jump to step 9 if
error/empty or to appropriate coercion)
7. evaluate coercion (via expression or via IO in
ExecEvalJsonCoercionViaPopulateOrIO) ->
8. coercion finish
9. JUMP_IF_NOT_TRUE (error) (jumps to skip the next expression if !error)
10. ON ERROR expression
12. JUMP_IF_NOT_TRUE (empty) (jumps to skip the next expression if !empty)
13. ON EMPTY expression

There are also some unconditional JUMPs added in between above steps
to skip to end or the appropriate target address as needed.

> > + EEO_CASE(EEOP_JSONEXPR_BEHAVIOR)
> > + {
> > + /* too complex for an inline implementation */
> > + EEO_JUMP(ExecEvalJsonExprBehavior(state, op));
> > + }
> > +
> > + EEO_CASE(EEOP_JSONEXPR_COERCION)
> > + {
> > + /* too complex for an inline implementation */
> > + EEO_JUMP(ExecEvalJsonExprCoercion(state, op, econtext,
> > + *op->resvalue, *op->resnull));
> > + }
>
> I wonder if this is the right design for this op - you're declaring this to be
> op not worth implementing inline, yet you then have it implemented by hand for JIT.

This has been redesigned to not require the hard-coded jumps like these.

> > +/*
> > + * Evaluate given JsonExpr by performing the specified JSON operation.
> > + *
> > + * This also populates the JsonExprPostEvalState with the information needed
> > + * by the subsequent steps that handle the specified JsonBehavior.
> > + */
> > +void
> > +ExecEvalJsonExpr(ExprState *state, ExprEvalStep *op, ExprContext *econtext)
> > +{
> > + JsonExprState *jsestate = op->d.jsonexpr.jsestate;
> > + JsonExprPreEvalState *pre_eval = &jsestate->pre_eval;
> > + JsonExprPostEvalState *post_eval = &jsestate->post_eval;
> > + JsonExpr *jexpr = jsestate->jsexpr;
> > + Datum item;
> > + Datum res = (Datum) 0;
> > + bool resnull = true;
> > + JsonPath *path;
> > + bool throw_error = (jexpr->on_error->btype == JSON_BEHAVIOR_ERROR);
> > + bool *error = &post_eval->error;
> > + bool *empty = &post_eval->empty;
> > +
> > + item = pre_eval->formatted_expr.value;
> > + path = DatumGetJsonPathP(pre_eval->pathspec.value);
> > +
> > + /* Reset JsonExprPostEvalState for this evaluation. */
> > + memset(post_eval, 0, sizeof(*post_eval));
> > +
> > + switch (jexpr->op)
> > + {
> > + case JSON_EXISTS_OP:
> > + {
> > + bool exists = JsonPathExists(item, path,
> > + !throw_error ? error : NULL,
> > + pre_eval->args);
> > +
> > + post_eval->jcstate = jsestate->result_jcstate;
> > + if (*error)
> > + {
> > + *op->resnull = true;
> > + *op->resvalue = (Datum) 0;
> > + return;
> > + }
> > +
> > + resnull = false;
> > + res = BoolGetDatum(exists);
> > + break;
> > + }
>
> Kinda seems there should be a EEOP_JSON_EXISTS/JSON_QUERY_OP op, instead of
> implementing it all inside ExecEvalJsonExpr. I think this might obsolete
> needing to rediscover that the value is null in SKIP etc?

I tried but didn't really see the point of breaking
ExecEvalJsonExprPath() down into one step per JSON_*OP.

The skipping logic is based on the result of *2* input expressions
formatted_expr and pathspec which are computed before getting to
ExecEvalJsonExprPath(). Also, the skipping logic also allows to skip
the evaluation of PASSING arguments which also need to be computed
before ExecEvalJsonExprPath().

> > + case JSON_QUERY_OP:
> > + res = JsonPathQuery(item, path, jexpr->wrapper, empty,
> > + !throw_error ? error : NULL,
> > + pre_eval->args);
> > +
> > + post_eval->jcstate = jsestate->result_jcstate;
> > + if (*error)
> > + {
> > + *op->resnull = true;
> > + *op->resvalue = (Datum) 0;
> > + return;
> > + }
> > + resnull = !DatumGetPointer(res);
>
> Shoulnd't this check empty?

Fixed.

> FWIW, it's also pretty odd that JsonPathQuery() once
> return (Datum) 0;
> and later does
> return PointerGetDatum(NULL);

Yes, fixed to use the former style at all returns.

> > + case JSON_VALUE_OP:
> > + {
> > + JsonbValue *jbv = JsonPathValue(item, path, empty,
> > + !throw_error ? error : NULL,
> > + pre_eval->args);
> > +
> > + /* Might get overridden below by an item_jcstate. */
> > + post_eval->jcstate = jsestate->result_jcstate;
> > + if (*error)
> > + {
> > + *op->resnull = true;
> > + *op->resvalue = (Datum) 0;
> > + return;
> > + }
> > +
> > + if (!jbv) /* NULL or empty */
> > + {
> > + resnull = true;
> > + break;
> > + }
> > +
> > + Assert(!*empty);
> > +
> > + resnull = false;
> > +
> > + /* Coerce scalar item to the output type */
> > +
> > + /*
> > + * If the requested output type is json(b), use
> > + * JsonExprState.result_coercion to do the coercion.
> > + */
> > + if (jexpr->returning->typid == JSONOID ||
> > + jexpr->returning->typid == JSONBOID)
> > + {
> > + /* Use result_coercion from json[b] to the output type */
> > + res = JsonbPGetDatum(JsonbValueToJsonb(jbv));
> > + break;
> > + }
> > +
> > + /*
> > + * Else, use one of the item_coercions.
> > + *
> > + * Error out if no cast exists to coerce SQL/JSON item to the
> > + * the output type.
> > + */
> > + res = ExecPrepareJsonItemCoercion(jbv,
> > + jsestate->item_jcstates,
> > + &post_eval->jcstate);
> > + if (post_eval->jcstate &&
> > + post_eval->jcstate->coercion &&
> > + (post_eval->jcstate->coercion->via_io ||
> > + post_eval->jcstate->coercion->via_populate))
> > + {
> > + if (!throw_error)
> > + {
> > + *op->resnull = true;
> > + *op->resvalue = (Datum) 0;
> > + return;
> > + }
> > +
> > + /*
> > + * Coercion via I/O means here that the cast to the target
> > + * type simply does not exist.
> > + */
> > + ereport(ERROR,
> > + (errcode(ERRCODE_SQL_JSON_ITEM_CANNOT_BE_CAST_TO_TARGET_TYPE),
> > + errmsg("SQL/JSON item cannot be cast to target type")));
> > + }
> > + break;
> > + }
> > +
> > + default:
> > + elog(ERROR, "unrecognized SQL/JSON expression op %d", jexpr->op);
> > + *op->resnull = true;
> > + *op->resvalue = (Datum) 0;
> > + return;
> > + }
> > +
> > + /*
> > + * If the ON EMPTY behavior is to cause an error, do so here. Other
> > + * behaviors will be handled in ExecEvalJsonExprBehavior().
> > + */
> > + if (*empty)
> > + {
> > + Assert(jexpr->on_empty); /* it is not JSON_EXISTS */
> > +
> > + if (jexpr->on_empty->btype == JSON_BEHAVIOR_ERROR)
> > + {
> > + if (!throw_error)
> > + {
> > + *op->resnull = true;
> > + *op->resvalue = (Datum) 0;
> > + return;
> > + }
> > +
> > + ereport(ERROR,
> > + (errcode(ERRCODE_NO_SQL_JSON_ITEM),
> > + errmsg("no SQL/JSON item")));
> > + }
> > + }
> > +
> > + *op->resvalue = res;
> > + *op->resnull = resnull;
> > +}
> > +
> > +/*
> > + * Skip calling ExecEvalJson() on the given JsonExpr?
>
> I don't think that function exists.

Fixed.

> > + * Returns the step address to be performed next.
> > + */
> > +int
> > +ExecEvalJsonExprSkip(ExprState *state, ExprEvalStep *op)
> > +{
> > + JsonExprState *jsestate = op->d.jsonexpr_skip.jsestate;
> > +
> > + /*
> > + * Skip if either of the input expressions has turned out to be NULL,
> > + * though do execute domain checks for NULLs, which are handled by the
> > + * coercion step.
> > + */
> > + if (jsestate->pre_eval.formatted_expr.isnull ||
> > + jsestate->pre_eval.pathspec.isnull)
> > + {
> > + *op->resvalue = (Datum) 0;
> > + *op->resnull = true;
> > +
> > + /* ExecEvalJsonExprCoercion() depends on this. */
> > + jsestate->post_eval.jcstate = jsestate->result_jcstate;
> > +
> > + return op->d.jsonexpr_skip.jump_coercion;
> > + }
> > +
> > + /*
> > + * Go evaluate the PASSING args if any and subsequently JSON path itself.
> > + */
> > + return op->d.jsonexpr_skip.jump_passing_args;
> > +}
> > +
> > +/*
> > + * Returns the step address to perform the JsonBehavior applicable to
> > + * the JSON item that resulted from evaluating the given JsonExpr.
> > + *
> > + * Returns the step address to be performed next.
> > + */
> > +int
> > +ExecEvalJsonExprBehavior(ExprState *state, ExprEvalStep *op)
> > +{
> > + JsonExprState *jsestate = op->d.jsonexpr_behavior.jsestate;
> > + JsonExprPostEvalState *post_eval = &jsestate->post_eval;
> > + JsonBehavior *behavior = NULL;
> > + int jump_to = -1;
> > +
> > + if (post_eval->error || post_eval->coercion_error)
> > + {
> > + behavior = jsestate->jsexpr->on_error;
> > + jump_to = op->d.jsonexpr_behavior.jump_onerror_expr;
> > + }
> > + else if (post_eval->empty)
> > + {
> > + behavior = jsestate->jsexpr->on_empty;
> > + jump_to = op->d.jsonexpr_behavior.jump_onempty_expr;
> > + }
> > + else if (!post_eval->coercion_done)
> > + {
> > + /*
> > + * If no error or the JSON item is not empty, directly go to the
> > + * coercion step to coerce the item as is.
> > + */
> > + return op->d.jsonexpr_behavior.jump_coercion;
> > + }
> > +
> > + Assert(behavior);
> > +
> > + /*
> > + * Set up for coercion step that will run to coerce a non-default behavior
> > + * value. It should use result_coercion, if any. Errors that may occur
> > + * should be thrown for JSON ops other than JSON_VALUE_OP.
> > + */
> > + if (behavior->btype != JSON_BEHAVIOR_DEFAULT)
> > + {
> > + post_eval->jcstate = jsestate->result_jcstate;
> > + post_eval->coercing_behavior_expr = true;
> > + }
> > +
> > + Assert(jump_to >= 0);
> > + return jump_to;
> > +}
> > +
> > +/*
> > + * Evaluate or return the step address to evaluate a coercion of a JSON item
> > + * to the target type. The former if the coercion must be done right away by
> > + * calling the target type's input function, and for some types, by calling
> > + * json_populate_type().
> > + *
> > + * Returns the step address to be performed next.
> > + */
> > +int
> > +ExecEvalJsonExprCoercion(ExprState *state, ExprEvalStep *op,
> > + ExprContext *econtext,
> > + Datum res, bool resnull)
> > +{
> > + JsonExprState *jsestate = op->d.jsonexpr_coercion.jsestate;
> > + JsonExpr *jexpr = jsestate->jsexpr;
> > + JsonExprPostEvalState *post_eval = &jsestate->post_eval;
> > + JsonCoercionState *jcstate = post_eval->jcstate;
> > + char *val_string = NULL;
> > + bool omit_quotes = false;
> > +
> > + switch (jexpr->op)
> > + {
> > + case JSON_EXISTS_OP:
> > + if (jcstate && jcstate->jump_eval_expr >= 0)
> > + return jcstate->jump_eval_expr;
>
> Shouldn't this be a compile-time check and instead be handled by simply not
> emitting a step instead?

Yes and...

> > + /* No coercion needed. */
> > + post_eval->coercion_done = true;
> > + return op->d.jsonexpr_coercion.jump_coercion_done;
>
> Which then means we also don't need to emit anything here, no?

Yes.

Basically all jump address selection logic is now handled in
ExecInitJsonExpr() (compile-time) as described above.

> > +/*
> > + * Prepare SQL/JSON item coercion to the output type. Returned a datum of the
> > + * corresponding SQL type and a pointer to the coercion state.
> > + */
> > +static Datum
> > +ExecPrepareJsonItemCoercion(JsonbValue *item, List *item_jcstates,
> > + JsonCoercionState **p_item_jcstate)
>
> I might have missed it, but if not: The whole way the coercion stuff works
> needs a decent comment explaining how things fit together.
>
> What does "item" really mean here?

This term "item" I think refers to the JsonbValue returned by
JsonPathValue(), which can be one of jbvType types. Because we need
multiple coercions to account for that, I assume the original authors
decided to use the term/phrase "item coercions" to distinguish from
the result_coercion which assumes either a Boolean (EXISTS) or
Jsonb/text (QUERY) result.

> > +{
> > + JsonCoercionState *item_jcstate;
> > + Datum res;
> > + JsonbValue buf;
> > +
> > + if (item->type == jbvBinary &&
> > + JsonContainerIsScalar(item->val.binary.data))
> > + {
> > + bool res PG_USED_FOR_ASSERTS_ONLY;
> > +
> > + res = JsonbExtractScalar(item->val.binary.data, &buf);
> > + item = &buf;
> > + Assert(res);
> > + }
> > +
> > + /* get coercion state reference and datum of the corresponding SQL type */
> > + switch (item->type)
> > + {
> > + case jbvNull:
> > + item_jcstate = list_nth(item_jcstates, JsonItemTypeNull);
>
> This seems quite odd. We apparently have a fixed-length array, where specific
> offsets have specific meanings, yet it's encoded as a list that's then
> accessed with constant offsets?
>
> Right now ExecEvalJsonExpr() stores what ExecPrepareJsonItemCoercion() chooses
> in post_eval->jcstate. Which the immediately following
> ExecEvalJsonExprBehavior() then digs out again. Then there's also control flow
> via post_eval->coercing_behavior_expr. This is ... not nice.

Agree.

In the new code, the struct JsonCoercionState is gone. So any given
coercion boils down to a steps address during runtime, which is
determined by ExecEvalJsonExprPath().

> ISTM that jsestate should have an array of jump targets, indexed by
> item->type.

Yes, an array of jump targets seems better for these "item coercions".
result_coercion is a single address stored separately.

> Which, for llvm IR, you can encode as a switch statement, instead
> of doing control flow via JsonExprState/JsonExprPostEvalState. There's
> obviously a bit more needed, but I think something like that should work, and
> simplify things a fair bit.

Thanks for suggesting the switch-case idea. The LLVM IR for
EEOP_JSONEXPR_PATH now includes one to jump to one of the coercion
addresses between that for result_coercion and "item coercions" if
present.

> > @@ -15711,6 +15721,192 @@ func_expr_common_subexpr:
> > n->location = @1;
> > $$ = (Node *) n;
> > }
> > + | JSON_QUERY '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_wrapper_behavior
> > + json_quotes_clause_opt
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_QUERY_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->wrapper = $5;
> > + if (n->wrapper != JSW_NONE && $6 != JS_QUOTES_UNSPEC)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_SYNTAX_ERROR),
> > + errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
> > + parser_errposition(@6)));
> > + n->quotes = $6;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
> > + | JSON_QUERY '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_wrapper_behavior
> > + json_quotes_clause_opt
> > + json_query_behavior ON EMPTY_P
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_QUERY_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->wrapper = $5;
> > + if (n->wrapper != JSW_NONE && $6 != JS_QUOTES_UNSPEC)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_SYNTAX_ERROR),
> > + errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
> > + parser_errposition(@6)));
> > + n->quotes = $6;
> > + n->on_empty = $7;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
> > + | JSON_QUERY '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_wrapper_behavior
> > + json_quotes_clause_opt
> > + json_query_behavior ON ERROR_P
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_QUERY_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->wrapper = $5;
> > + if (n->wrapper != JSW_NONE && $6 != JS_QUOTES_UNSPEC)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_SYNTAX_ERROR),
> > + errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
> > + parser_errposition(@6)));
> > + n->quotes = $6;
> > + n->on_error = $7;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
> > + | JSON_QUERY '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_wrapper_behavior
> > + json_quotes_clause_opt
> > + json_query_behavior ON EMPTY_P
> > + json_query_behavior ON ERROR_P
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_QUERY_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->wrapper = $5;
> > + if (n->wrapper != JSW_NONE && $6 != JS_QUOTES_UNSPEC)
> > + ereport(ERROR,
> > + (errcode(ERRCODE_SYNTAX_ERROR),
> > + errmsg("SQL/JSON QUOTES behavior must not be specified when WITH WRAPPER is used"),
> > + parser_errposition(@6)));
> > + n->quotes = $6;
> > + n->on_empty = $7;
> > + n->on_error = $10;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
>
> I'm sure we can find a way to deduplicate this.
>
>
> > + | JSON_EXISTS '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + ')'
> > + {
> > + JsonFuncExpr *p = makeNode(JsonFuncExpr);
> > +
> > + p->op = JSON_EXISTS_OP;
> > + p->common = (JsonCommon *) $3;
> > + p->output = (JsonOutput *) $4;
> > + p->location = @1;
> > + $$ = (Node *) p;
> > + }
> > + | JSON_EXISTS '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_exists_behavior ON ERROR_P
> > + ')'
> > + {
> > + JsonFuncExpr *p = makeNode(JsonFuncExpr);
> > +
> > + p->op = JSON_EXISTS_OP;
> > + p->common = (JsonCommon *) $3;
> > + p->output = (JsonOutput *) $4;
> > + p->on_error = $5;
> > + p->location = @1;
> > + $$ = (Node *) p;
> > + }
> > + | JSON_VALUE '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_VALUE_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
> > +
> > + | JSON_VALUE '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_value_behavior ON EMPTY_P
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_VALUE_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->on_empty = $5;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
> > + | JSON_VALUE '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_value_behavior ON ERROR_P
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_VALUE_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->on_error = $5;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
> > +
> > + | JSON_VALUE '('
> > + json_api_common_syntax
> > + json_returning_clause_opt
> > + json_value_behavior ON EMPTY_P
> > + json_value_behavior ON ERROR_P
> > + ')'
> > + {
> > + JsonFuncExpr *n = makeNode(JsonFuncExpr);
> > +
> > + n->op = JSON_VALUE_OP;
> > + n->common = (JsonCommon *) $3;
> > + n->output = (JsonOutput *) $4;
> > + n->on_empty = $5;
> > + n->on_error = $8;
> > + n->location = @1;
> > + $$ = (Node *) n;
> > + }
> > ;
>
> And this.
>
>
>
> > +json_query_behavior:
> > + ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL, @1); }
> > + | NULL_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL, @1); }
> > + | DEFAULT a_expr { $$ = makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2, @1); }
> > + | EMPTY_P ARRAY { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL, @1); }
> > + | EMPTY_P OBJECT_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_OBJECT, NULL, @1); }
> > + /* non-standard, for Oracle compatibility only */
> > + | EMPTY_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_EMPTY_ARRAY, NULL, @1); }
> > + ;
>
>
> > +json_exists_behavior:
> > + ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL, @1); }
> > + | TRUE_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_TRUE, NULL, @1); }
> > + | FALSE_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_FALSE, NULL, @1); }
> > + | UNKNOWN { $$ = makeJsonBehavior(JSON_BEHAVIOR_UNKNOWN, NULL, @1); }
> > + ;
> > +
> > +json_value_behavior:
> > + NULL_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_NULL, NULL, @1); }
> > + | ERROR_P { $$ = makeJsonBehavior(JSON_BEHAVIOR_ERROR, NULL, @1); }
> > + | DEFAULT a_expr { $$ = makeJsonBehavior(JSON_BEHAVIOR_DEFAULT, $2, @1); }
> > + ;
>
> This also seems like it could use some dedup.
>
> > src/backend/parser/gram.y | 348 +++++-

I've given that a try and managed to reduce the gram.y footprint down to:

src/backend/parser/gram.y | 217 +++-

> This causes a nontrivial increase in the size of the parser (~5% in an
> optimized build here), I wonder if we can do better.

Hmm, sorry if I sound ignorant but what do you mean by the parser here?

I can see that the byte-size of gram.o increases by 1.66% after the
above additions (1.72% with previous versions). I've also checked
using log_parser_stats that there isn't much slowdown in the
raw-parsing speed.

Attached updated patch. The version of 0001 that I posted on Oct 11
to add the error-safe version of CoerceViaIO contained many
unnecessary bits that are now removed.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com

Attachment Content-Type Size
v24-0005-Claim-SQL-standard-compliance-for-SQL-JSON-featu.patch application/octet-stream 2.3 KB
v24-0001-Add-soft-error-handling-to-some-expression-nodes.patch application/octet-stream 9.1 KB
v24-0002-Add-soft-error-handling-to-populate_record_field.patch application/octet-stream 23.2 KB
v24-0004-JSON_TABLE.patch application/octet-stream 165.3 KB
v24-0003-SQL-JSON-query-functions.patch application/octet-stream 199.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Jakub Wartak 2023-11-15 14:13:32 Re: trying again to get incremental backup
Previous Message Daniel Gustafsson 2023-11-15 12:53:13 Re: Fix documentation for pg_stat_statements JIT deform_counter