Re: Schema variables - new implementation for Postgres 15

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: dean(dot)a(dot)rasheed(at)gmail(dot)com, er(at)xs4all(dot)nl, joel(at)compiler(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Schema variables - new implementation for Postgres 15
Date: 2022-10-31 20:27:21
Message-ID: CAFj8pRCk-AR50YUn9131du6G_DPcckm+uiD-MJozzGVO9au2Rw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

po 17. 10. 2022 v 5:17 odesílatel Julien Rouhaud <rjuju123(at)gmail(dot)com>
napsal:

> Hi,
>
> On Thu, Oct 13, 2022 at 07:41:32AM +0200, Pavel Stehule wrote:
> >
> > > I fixed the commit message of 0001 patch. Fixed shadowed variables too.
>
> Thanks!
>
> > >
> > > There is a partially open issue, where I and Julien are not sure about
> a
> > > solution, and we would like to ask for the community's opinion. I'll
> send
> > > this query in separate mail.
> > >
> >
> > rebased with simplified code related to usage of pfree function
>
> If anyone is curious the discussion happend at [1].
>
> I looked at the patchset, this time focusing on the LET command. Here at
> the
> comments I have for now:
>
> - gram.y
>
> @@ -11918,6 +11920,7 @@ ExplainableStmt:
> | CreateMatViewStmt
> | RefreshMatViewStmt
> | ExecuteStmt /*
> by default all are $$=$1 */
> + | LetStmt
> ;
>
> (and other similar places) the comment should be kept to the last statement
>

fixed

>
> Also, having LetStmt as an ExplainableStmt means it's allowed in a CTE:
>
> cte_list:
> common_table_expr
> { $$ = list_make1($1); }
> | cte_list ',' common_table_expr { $$ =
> lappend($1, $3); }
> ;
>
> common_table_expr: name opt_name_list AS opt_materialized '('
> PreparableStmt ')' opt_search_clause opt_cycle_clause
>
> And doing so hits this assert in transformWithClause:
>
> if (!IsA(cte->ctequery, SelectStmt))
> {
> /* must be a data-modifying statement */
> Assert(IsA(cte->ctequery, InsertStmt) ||
> IsA(cte->ctequery, UpdateStmt) ||
> IsA(cte->ctequery, DeleteStmt));
>
> pstate->p_hasModifyingCTE = true;
> }
>
> and I'm assuming it would also fail on this in transformLetStmt:
>
> + /* There can't be any outer WITH to worry about */
> + Assert(pstate->p_ctenamespace == NIL);
>
> I guess it makes sense to be able to explain a LetStmt (or using it in a
> prepared statement), so it should be properly handled in
> transformSelectStmt.
> Also, I don't see any test for a prepared LET statement, this should also
> be
> covered.
>

The LET statement doesn't return data, so it should be disallowed similar
like MERGE statement

I enhanced the regression test about PREPARE of the LET statement. I found
and fix the missing plan dependency of target variable of LET command

> - transformLetStmt:
>
> + varid = IdentifyVariable(names, &attrname, &not_unique);
>
> It would be nice to have a comment saying that the lock is acquired here
>

done

>
> + /* The grammar should have produced a SELECT */
> + if (!IsA(selectQuery, Query) ||
> + selectQuery->commandType != CMD_SELECT)
> + elog(ERROR, "unexpected non-SELECT command in LET
> command");
>
> I'm wondering if this should be an Assert instead, as the grammar shouldn't
> produce anything else no matter what how hard a user try.
>

done

>
> + /* don't allow multicolumn result */
> + if (list_length(exprList) != 1)
> + ereport(ERROR,
> + (errcode(ERRCODE_SYNTAX_ERROR),
> + errmsg("expression is not scalar value"),
> + parser_errposition(pstate,
> +
> exprLocation((Node *) exprList))));
>
> This isn't covered by any regression test and it probably should. It can
> be
> reached with something like
>
> LET myvar = (null::pg_class).*;
>
> The error message could also use a bit of improvement.
>

done - the error message is like related plpgsql error message

>
> I see that a_expr allows a select statement in parens, but this leads to a
> sublink which already has all the required protection to gurantee a single
> column, and a single row at most during execution. This one returns for
> non-scalar case:
>
> subquery must return only one column
>
> Maybe use something similar for it, like "expression must return only one
> column"? Similarly the error message in svariableStartupReceiver could be
> made
> more consistent with the related errors:
>
> + if (++outcols > 1)
> + elog(ERROR, "svariable DestReceiver can take only
> one attribute");
>

done

>
> While on svariableReceiver, I see that the current code assumes that the
> caller
> did everything right. That's the case right now, but it should still be
> made
> more robust in case future code (or extensions) is added. I'm thinking:
>
> - svariableState.rows. Currently not really used, should check that one
> and
> only one row is received in svariableReceiveSlot and
> svariableShutdownReceiver (if no row is received the variable won't be
> reset
> which should probably always happen once you setup an svariableReceiver)
>

done

> - svariableState.typid, typmod and typlen should be double checked with the
> given varid in svariableStartupReceiver.
>

done

> - svariableState.varid should be initialized with InvalidOid to avoid
> undefined
> behavior is caller forgets to set it.
>

svariableState is initialized by palloc0

>
> I'm also wondering if SetVariableDestReceiverParams() should have an assert
> like LockHeldByMe() for the given varid,

done

> and maybe an assert that the varid is
> a session variable, to avoid running a possibly expensive execution that
> will
>

done

> fail when receiving the slot. I think the function would be better named
> SetVariableDestReceiverVarid() or something like that.
>

done

>
> +void
> +ExecuteLetStmt(ParseState *pstate,
> + LetStmt *stmt,
> + ParamListInfo params,
> + QueryEnvironment *queryEnv,
> + QueryCompletion *qc)
> +{
> + [...]
> + /* run the plan to completion */
> + ExecutorRun(queryDesc, ForwardScanDirection, 2L, true);
>
> Why 2 rows? I'm assuming it's an attempt to detect queries that returns
> more
> than 1 row, but it should be documented. Note that as mentioned above the
> dest
> receiver currently doesn't check it, so this definitely needs to be fixed.
>

done + check + tests

>
> - IdentifyVariable:
>
> *attrname can be set even is no variable is identified. I guess that's ok
> as
> it avoids useless code, but it should probably be documented in the
> function
> header.
>

This is a side effect. The attrname is used only when the returned oid is
valid. I checked code, and
I extended comments on the function.

I am sending updated patch, next points I'll process tomorrow

>
> Also, the API doesn't look ideal. AFAICS the only reason this function
> doesn't
> error out in case of ambiguous name is that transformColumnRef may check
> if a
> given name shadows a variable when session_variables_ambiguity_warning is
> set.
> But since IdentifyVariable returns InvalidOid if the given list of
> identifiers
> is ambiguous, it seems that the shadow detection can fail to detect a
> shadowed
> reference if multiple variable would shadow the name:
>
> # CREATE TYPE ab AS (a integer, b integer);
> CREATE TYPE
> # CREATE VARIABLE v_ab AS ab;
> CREATE VARIABLE
>
> # CREATE TABLE v_ab (a integer, b integer);
> CREATE TABLE
>
> # SET session_variables_ambiguity_warning = 1;
> SET
>
> # sELECT v_ab.a FROM v_ab;
> WARNING: 42702: session variable "v_ab.a" is shadowed
> LINE 1: select v_ab.a from v_ab;
> ^
> DETAIL: Session variables can be shadowed by columns, routine's variables
> and routine's arguments with the same name.
> a
> ---
> (0 rows)
>
> # CREATE SCHEMA v_ab;
> CREATE SCHEMA
>
> # CREATE VARIABLE v_ab.a AS integer;
> CREATE VARIABLE
>
> # SELECT v_ab.a FROM v_ab;
> a
> ---
> (0 rows)
>
>
> Note that a bit later in transformColumnRef(), not_unique is checked only
> if
> the returned varid is valid, which isn't correct as InvalidOid is currently
> returned if not_unique is set.
>
> I think that the error should be raised in IdentifyVariable rather than
> having
> every caller check it. I'm not sure how to perfectly handle the
> session_variables_ambiguity_warning though. Maybe make not_unique
> optional,
> and error out if not_unique is null. If not null, set it as necessary and
> return one of the oid. The only use would be for shadowing detection, and
> in
> that case it won't be possible to check if a warning can be avoided as it
> would
> be if no amgibuity is found, but that's probably ok.
>
> Or maybe instead LookupVariable should have an extra argument to only match
> variable with a composite type if caller asks to. This would avoid
> scenarios
> like:
>
> CREATE VARIABLE myvar AS int;
> SELECT myvar.blabla;
> ERROR: 42809: type integer is not composite
>
> Is that really ok to match a variable here rather than complaining about a
> missing FROM-clause?
>
> + indirection_start = list_length(names) - (attrname ? 1 : 0);
> + indirection = list_copy_tail(stmt->target, indirection_start);
> + [...]
> + if (indirection != NULL)
> + {
> + bool targetIsArray;
> + char *targetName;
> +
> + targetName = get_session_variable_name(varid);
> + targetIsArray =
> OidIsValid(get_element_type(typid));
> +
> + pstate->p_hasSessionVariables = true;
> +
> + coerced_expr = (Expr *)
> + transformAssignmentIndirection(pstate,
> +
> (Node *) param,
> +
> targetName,
> +
> targetIsArray,
> +
> typid,
> +
> typmod,
> +
> InvalidOid,
> +
> indirection,
> +
> list_head(indirection),
> +
> (Node *) expr,
> +
> COERCION_PLPGSQL,
> +
> stmt->location);
> + }
>
> I'm not sure why you use this approach rather than just having something
> like
> "ListCell *indirection_head", set it to a non-NULL value when needed, and
> use
> that (with names) instead. Note that it's also not correct to compare a
> List
> to NULL, use NIL instead.
>
> - expr_kind_allows_session_variables
>
> Even if that's a bit annoying, I think it's better to explicitly put all
> values
> there rather than having a default clause.
>
> For instance, EXPR_KIND_CYCLE_MARK is currently allowing session variables,
> which doesn't look ok. It's probably just an error from when the patchset
> was
> rebased, but this probably wouldn't happen if you get an error for an
> unmatched
> value if you add a new expr kind (which doesn't happen that often).
>

+ fixed issue reported by Dmitry Dolgov

>
> [1]
> https://www.postgresql.org/message-id/CAFj8pRB2+pVBFsidS-AzhHdZid40OTUspWfXS0vgahHmaWosZQ@mail.gmail.com
>

Attachment Content-Type Size
0007-possibility-to-dump-session-variables-by-pg_dump.patch text/x-patch 19.9 KB
0010-documentation.patch text/x-patch 43.3 KB
0008-regress-tests-for-session-variables.patch text/x-patch 57.6 KB
0006-enhancing-psql-for-session-variables.patch text/x-patch 15.2 KB
0009-this-patch-changes-error-message-column-doesn-t-exis.patch text/x-patch 23.7 KB
0005-DISCARD-VARIABLES-command.patch text/x-patch 3.2 KB
0004-support-of-LET-command-in-PLpgSQL.patch text/x-patch 11.9 KB
0003-LET-command.patch text/x-patch 45.1 KB
0002-session-variables.patch text/x-patch 101.7 KB
0001-catalog-support-for-session-variables.patch text/x-patch 101.9 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Ilya Gladyshev 2022-10-31 21:52:19 Re: Segfault on logical replication to partitioned table with foreign children
Previous Message Jonathan S. Katz 2022-10-31 20:27:08 User functions for building SCRAM secrets