Re: Schema variables - new implementation for Postgres 15

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Sergey Shinderuk <s(dot)shinderuk(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, dean(dot)a(dot)rasheed(at)gmail(dot)com, er(at)xs4all(dot)nl, joel(at)compiler(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org
Subject: Re: Schema variables - new implementation for Postgres 15
Date: 2023-12-03 05:04:12
Message-ID: CAFj8pRCr3FuDuqMjNo=LouV=H4t8L6o584gWsw5jxeyXknkFbA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi

ne 26. 11. 2023 v 18:56 odesílatel Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
napsal:

> > On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote:
> > so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
> > napsal:
> > > As a side note, I'm intended to go one more time through the first few
> > > patches introducing the basic functionality, and then mark it as ready
> > > in CF. I can't break the patch in testing since quite long time, and
> for
> > > most parts the changes make sense to me.
> >
> > I marked pg_session_variables function as PARALLEL RESTRICTED, and did
> > rebase
>
> So, after one week of uninterrupted evening reviews I've made it through
> the first four patches :)
>
> It's a decent job -- more than once, looking at the code, I thought I
> could construct a case when it's going to blow up, but everything was
> working just fine. Yet, I think the patch still has to be reshaped a bit
> before moving forward. I've got a couple proposals of different nature:
> high level changes (you probably won't like some of them, but I'm sure
> they're going to be useful), technical code-level improvements/comments,
> and few language changes. With those changes in mind I would be
> satisfied with the patch, and hopefully they would also make it easier
> for a potential committer to pick it up.
>
> # High level proposals
>
> * I would suggest reducing the scope of the patch as much as possible,
> and not just by trimming on the edges, but rather following Phileas
> Fogg's example with the steamboat Henrietta -- get rid of all
> non-essential parts. This will make this rather large patch more
> approachable for others.
>
> For that one can concentrate only on the first two patches plus the
> fourth one (memory cleanup after dropping variables), leaving DISCARD,
> ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the
> future.
>
> Another thing in this context would be to evaluate plpgsql support for
> this feature. You know the use case better than me, how important it
> is? Is it an intrinsic part of the feature, or session variables could
> be still valuable enough even without plpgsql? From what I see
> postponing plgpsql will make everything about ~800 lines lighter (most
> likely more), and also allow to ignore couple of concerns about the
> implementation (about this later).
>
> * The new GUC session_variables_ambiguity_warning is definitely going to
> cause many objections, it's another knob to manage very subtle
> behaviour detail very few people will ever notice. I see the point
> behind warning about ambiguity, so probably it makes sense to bite the
> bullet and decide one way or another. The proposal is to warn always
> in potentially ambiguous situations, and if concerns are high about
> logging too much, maybe do the warning on lower logging levels.
>
> # Code-level observations
>
> * It feels a bit awkward to have varid assignment logic in a separate
> function, what about adding an argument with varid to
> CreateVariableDestReceiver? SetVariableDestReceiverVarid still could
> be used for CreateDestReceiver.
>
> /*
> * Initially create a DestReceiver object.
> */
> DestReceiver *
> CreateVariableDestReceiver(void)
>
> /*
> * Set parameters for a VariableDestReceiver.
> * Should be called right after creating the DestReceiver.
> */
> void
> SetVariableDestReceiverVarid(DestReceiver *self, Oid varid)
>
> * It's worth it to add a commentary here explaining why it's fine to use
> InvalidOid here:
>
> if (pstmt->commandType != CMD_UTILITY)
> - ExplainOnePlan(pstmt, into, es, query_string, paramLI,
> queryEnv,
> + ExplainOnePlan(pstmt, into, InvalidOid, es, query_string,
> paramLI, queryEnv,
> &planduration, (es->buffers ? &bufusage :
> NULL));
>
> My understanding is that since LetStmt is CMD_UTILITY, this branch
> will never be visited for a session variable.
>
> * IIUC this one is introduced to exclude session variables from the normal
> path with EXPR_KIND_UPDATE_TARGET:
>
> + EXPR_KIND_ASSIGN_VARIABLE, /* PL/pgSQL assignment target -
> disallow
> + * session
> variables */
>
> But the name doesn't sound right, maybe longer
> EXPR_KIND_UPDATE_TARGET_NO_VARS is better?
>
> * I'm curious about this one, which exactly part does this change cover?
>
> @@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node,
> - if (param->paramkind != PARAM_EXTERN)
> + if (param->paramkind != PARAM_EXTERN &&
> + param->paramkind != PARAM_VARIABLE)
> elog(ERROR, "unexpected paramkind: %d", (int)
> param->paramkind);
>
> I've commented it out, but no tests were affected.
>
> * Does it mean there could be theoretically two LET statements at the
> same time with different command type, one CMD_UTILITY, one
> CMD_SELECT? Can it cause any issues?
>
> + /*
> + * Inside PL/pgSQL we don't want to execute LET statement as
> utility
> + * command, because it disallow to execute expression as simple
> + * expression. So for PL/pgSQL we have extra path, and we return
> SELECT.
> + * Then it can be executed by exec_eval_expr. Result is dirrectly
> assigned
> + * to target session variable inside PL/pgSQL LET statement
> handler. This
> + * is extra code, extra path, but possibility to get faster
> execution is
> + * too attractive.
> + */
> + if (stmt->plpgsql_mode)
> + return query;
> +
>
> * This probably requires more explanation, is warning the only reason
> for this change?
>
> + *
> + * The session variables should not be used as target of PL/pgSQL
> assign
> + * statement. So we should to use special parser expr kind, that
> disallow
> + * usage of session variables. This block unwanted (in this
> context)
> + * possible warning so target PL/pgSQL's variable shadows some
> session
> + * variable.
> */
> target = transformExpr(pstate, (Node *) cref,
> -
> EXPR_KIND_UPDATE_TARGET);
> +
> EXPR_KIND_ASSIGN_VARIABLE);
>
> * It would be great to have more commentaries here:
>
> typedef struct
> {
> DestReceiver pub;
> Oid varid;
> Oid typid;
> int32 typmod;
> int typlen;
> int slot_offset;
> int rows;
> } SVariableState;
>
> For example, why does it make sense to have a field rows, where we
> interested to only know the fact that there is exactly one column?
>
> * Why there is SetSessionVariableWithSecurityCheck, but no
> GetSessionVariableWithSecurityCheck? Instead, object_aclcheck is done
> in standard_ExecutorStart, which looks a bit out of place.
>
> * pg_session_variables -- you mention it exists only for testing. What
> about moving it out into a separate patch for the sake of slimming
> down? It looks like it's used only in tests for "memory cleanup"
> patch, maybe they could be restructured to not require this function.
>
> * Probably it's time to drop unnecessary historical notes, like this:
>
> * Note: originally we enhanced a list xact_recheck_varids here.
> Unfortunately
> * it was not safe and a little bit too complex, because the sinval
> callback
> * function can be called when we iterate over xact_recheck_varids list.
> * Another issue was the possibility of being out of memory when we
> enhanced
> * the list. So now we just switch flag in related entry sessionvars hash
> table.
> * We need to iterate over hash table on every sinval message, so extra two
> * iteration over this hash table is not significant overhead (and we skip
> * entries that don't require recheck). Now we do not have any memory
> allocation
> * in the sinval handler (This note can be removed before commit).
>
> * The second patch "Storage for session variables and SQL interface",
> mentions DISCARD command:
>
> /*
> * There is no guarantee of sessionvars being initialized, even when
> * receiving an invalidation callback, as DISCARD [ ALL | VARIABLES ]
> * destroys the hash table entirely.
> */
>
> This command is implemented in another patch later one, so this
> comment probably belong there.
>
> * This comment mentions a "direct access, without buffering":
>
> /*
> * Direct access to session variable (without buffering). Because
> * returned value can be used (without an assignement) after the
> * referenced session variables is updated, we have to use an copy
> * of stored value every time.
> */
> *op->resvalue = GetSessionVariableWithTypeCheck(op->d.vparam.varid,
>
> op->resnull,
>
> op->d.vparam.vartype);
>
> But GetSessionVariableWithTypeCheck goes through get_session_variable
> and searches in the hash table. What "buffering" means in this
> context?
>
> * GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid
> expected_typid)
>
> Should the "WithTypeCheck" part be an argument of the
> GetSessionVariable? To reduce the code duplication a bit.
>
> * Just out of curiosity, why TopTransactionContext?
>
> /*
> * Store domain_check extra in TopTransactionContext. When we are
> in
> * other transaction, the domain_check_extra cache is not valid
> * anymore.
> */
> if (svar->domain_check_extra_lxid != MyProc->lxid)
> svar->domain_check_extra = NULL;
>
> domain_check(svar->value, svar->isnull,
> svar->typid, &svar->domain_check_extra,
> TopTransactionContext);
>
> * In SVariableData it would be great to have more comments around
> freeval, domain_check_extra, domain_check_extra_lxid.
>
> * Nitpicking, but the term "shadowing" for ambiguity between a session
> variable and a table column might be confusing, one can imagine there
> is a connection between those two objects and one actively follows
> ("shadows") the other one.
>
> * The second patch "Storage for session variables and SQL interface"
> mentions in the documentation default and temporary variables:
>
> <para>
> The value of a session variable is local to the current session.
> Retrieving
> a variable's value returns either a <literal>NULL</literal> or a
> default
> value, unless its value has been set to something else in the current
> session using the <command>LET</command> command. The content of a
> variable
> is not transactional. This is the same as regular variables in PL
> languages.
> The session variables can be persistent or can be temporary. In both
> cases,
> the content of session variables is temporary and not shared (like an
> content of temporary tables).
> </para>
>
> They're implemented in the following patches, so it belongs there.
>
> * Nitpicking, maybe merge those two conditions together for readability?
>
> if (!needs_validation)
> return;
>
> /*
> * Reset, this flag here, before we start the validation. It can be
> set to
> * on by incomming sinval message.
> */
> needs_validation = false;
>
> if (!sessionvars)
> return;
>
> * This one is not very clear, what is the difference between "somewhere
> inside a transaction" and "at the end of a transaction"?
>
> /*
> * This routine can be called somewhere inside transaction or at an
> transaction
> * end. When atEOX argument is false, then we are inside
> transaction, and we
> * don't want to throw entries related to session variables dropped
> in current
> * transaction.
> */
>
> # Language topic
>
> Since this patch introduces a large body of documentation and
> commentaries, I think it would benefit from a native speaker review.
> I've stumbled upon few examples (attached with proposed wording, without
> a diff extension to not confuse the CF bot), but otherwise if anyone
> follows this thread, texts review is appreciated.
>

Thank you for your review. Next two weeks I'll not too much time to work
on this patch - I have to work on some commercial work, and the week is
Prague PgConf, so my reply will be slow. But after these events I'll
concentrate on this patch.

Regards

Pavel

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Thomas Munro 2023-12-03 06:00:15 Re: postgres_fdw test timeouts
Previous Message Alexander Lakhin 2023-12-03 05:00:00 Re: postgres_fdw test timeouts