| From: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
|---|---|
| To: | Dmitry Dolgov <9erthalion6(at)gmail(dot)com> |
| Cc: | Peter Eisentraut <peter(dot)eisentraut(at)enterprisedb(dot)com>, Sergey Shinderuk <s(dot)shinderuk(at)postgrespro(dot)ru>, Tomas Vondra <tomas(dot)vondra(at)enterprisedb(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, dean(dot)a(dot)rasheed(at)gmail(dot)com, er(at)xs4all(dot)nl, joel(at)compiler(dot)org, pgsql-hackers(at)lists(dot)postgresql(dot)org |
| Subject: | Re: Schema variables - new implementation for Postgres 15 |
| Date: | 2023-12-03 05:04:12 |
| Message-ID: | CAFj8pRCr3FuDuqMjNo=LouV=H4t8L6o584gWsw5jxeyXknkFbA@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi
ne 26. 11. 2023 v 18:56 odesílatel Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
napsal:
> > On Sat, Nov 18, 2023 at 06:28:53PM +0100, Pavel Stehule wrote:
> > so 18. 11. 2023 v 15:54 odesílatel Dmitry Dolgov <9erthalion6(at)gmail(dot)com>
> > napsal:
> > > As a side note, I'm intended to go one more time through the first few
> > > patches introducing the basic functionality, and then mark it as ready
> > > in CF. I can't break the patch in testing since quite long time, and
> for
> > > most parts the changes make sense to me.
> >
> > I marked pg_session_variables function as PARALLEL RESTRICTED, and did
> > rebase
>
> So, after one week of uninterrupted evening reviews I've made it through
> the first four patches :)
>
> It's a decent job -- more than once, looking at the code, I thought I
> could construct a case when it's going to blow up, but everything was
> working just fine. Yet, I think the patch still has to be reshaped a bit
> before moving forward. I've got a couple proposals of different nature:
> high level changes (you probably won't like some of them, but I'm sure
> they're going to be useful), technical code-level improvements/comments,
> and few language changes. With those changes in mind I would be
> satisfied with the patch, and hopefully they would also make it easier
> for a potential committer to pick it up.
>
> # High level proposals
>
> * I would suggest reducing the scope of the patch as much as possible,
> and not just by trimming on the edges, but rather following Phileas
> Fogg's example with the steamboat Henrietta -- get rid of all
> non-essential parts. This will make this rather large patch more
> approachable for others.
>
> For that one can concentrate only on the first two patches plus the
> fourth one (memory cleanup after dropping variables), leaving DISCARD,
> ON TRANSACTION END, DEFAULT, IMMUTABLE for the follow-up in the
> future.
>
> Another thing in this context would be to evaluate plpgsql support for
> this feature. You know the use case better than me, how important it
> is? Is it an intrinsic part of the feature, or session variables could
> be still valuable enough even without plpgsql? From what I see
> postponing plgpsql will make everything about ~800 lines lighter (most
> likely more), and also allow to ignore couple of concerns about the
> implementation (about this later).
>
> * The new GUC session_variables_ambiguity_warning is definitely going to
> cause many objections, it's another knob to manage very subtle
> behaviour detail very few people will ever notice. I see the point
> behind warning about ambiguity, so probably it makes sense to bite the
> bullet and decide one way or another. The proposal is to warn always
> in potentially ambiguous situations, and if concerns are high about
> logging too much, maybe do the warning on lower logging levels.
>
> # Code-level observations
>
> * It feels a bit awkward to have varid assignment logic in a separate
> function, what about adding an argument with varid to
> CreateVariableDestReceiver? SetVariableDestReceiverVarid still could
> be used for CreateDestReceiver.
>
> /*
> * Initially create a DestReceiver object.
> */
> DestReceiver *
> CreateVariableDestReceiver(void)
>
> /*
> * Set parameters for a VariableDestReceiver.
> * Should be called right after creating the DestReceiver.
> */
> void
> SetVariableDestReceiverVarid(DestReceiver *self, Oid varid)
>
> * It's worth it to add a commentary here explaining why it's fine to use
> InvalidOid here:
>
> if (pstmt->commandType != CMD_UTILITY)
> - ExplainOnePlan(pstmt, into, es, query_string, paramLI,
> queryEnv,
> + ExplainOnePlan(pstmt, into, InvalidOid, es, query_string,
> paramLI, queryEnv,
> &planduration, (es->buffers ? &bufusage :
> NULL));
>
> My understanding is that since LetStmt is CMD_UTILITY, this branch
> will never be visited for a session variable.
>
> * IIUC this one is introduced to exclude session variables from the normal
> path with EXPR_KIND_UPDATE_TARGET:
>
> + EXPR_KIND_ASSIGN_VARIABLE, /* PL/pgSQL assignment target -
> disallow
> + * session
> variables */
>
> But the name doesn't sound right, maybe longer
> EXPR_KIND_UPDATE_TARGET_NO_VARS is better?
>
> * I'm curious about this one, which exactly part does this change cover?
>
> @@ -4888,21 +4914,43 @@ substitute_actual_parameters_mutator(Node *node,
> - if (param->paramkind != PARAM_EXTERN)
> + if (param->paramkind != PARAM_EXTERN &&
> + param->paramkind != PARAM_VARIABLE)
> elog(ERROR, "unexpected paramkind: %d", (int)
> param->paramkind);
>
> I've commented it out, but no tests were affected.
>
> * Does it mean there could be theoretically two LET statements at the
> same time with different command type, one CMD_UTILITY, one
> CMD_SELECT? Can it cause any issues?
>
> + /*
> + * Inside PL/pgSQL we don't want to execute LET statement as
> utility
> + * command, because it disallow to execute expression as simple
> + * expression. So for PL/pgSQL we have extra path, and we return
> SELECT.
> + * Then it can be executed by exec_eval_expr. Result is dirrectly
> assigned
> + * to target session variable inside PL/pgSQL LET statement
> handler. This
> + * is extra code, extra path, but possibility to get faster
> execution is
> + * too attractive.
> + */
> + if (stmt->plpgsql_mode)
> + return query;
> +
>
> * This probably requires more explanation, is warning the only reason
> for this change?
>
> + *
> + * The session variables should not be used as target of PL/pgSQL
> assign
> + * statement. So we should to use special parser expr kind, that
> disallow
> + * usage of session variables. This block unwanted (in this
> context)
> + * possible warning so target PL/pgSQL's variable shadows some
> session
> + * variable.
> */
> target = transformExpr(pstate, (Node *) cref,
> -
> EXPR_KIND_UPDATE_TARGET);
> +
> EXPR_KIND_ASSIGN_VARIABLE);
>
> * It would be great to have more commentaries here:
>
> typedef struct
> {
> DestReceiver pub;
> Oid varid;
> Oid typid;
> int32 typmod;
> int typlen;
> int slot_offset;
> int rows;
> } SVariableState;
>
> For example, why does it make sense to have a field rows, where we
> interested to only know the fact that there is exactly one column?
>
> * Why there is SetSessionVariableWithSecurityCheck, but no
> GetSessionVariableWithSecurityCheck? Instead, object_aclcheck is done
> in standard_ExecutorStart, which looks a bit out of place.
>
> * pg_session_variables -- you mention it exists only for testing. What
> about moving it out into a separate patch for the sake of slimming
> down? It looks like it's used only in tests for "memory cleanup"
> patch, maybe they could be restructured to not require this function.
>
> * Probably it's time to drop unnecessary historical notes, like this:
>
> * Note: originally we enhanced a list xact_recheck_varids here.
> Unfortunately
> * it was not safe and a little bit too complex, because the sinval
> callback
> * function can be called when we iterate over xact_recheck_varids list.
> * Another issue was the possibility of being out of memory when we
> enhanced
> * the list. So now we just switch flag in related entry sessionvars hash
> table.
> * We need to iterate over hash table on every sinval message, so extra two
> * iteration over this hash table is not significant overhead (and we skip
> * entries that don't require recheck). Now we do not have any memory
> allocation
> * in the sinval handler (This note can be removed before commit).
>
> * The second patch "Storage for session variables and SQL interface",
> mentions DISCARD command:
>
> /*
> * There is no guarantee of sessionvars being initialized, even when
> * receiving an invalidation callback, as DISCARD [ ALL | VARIABLES ]
> * destroys the hash table entirely.
> */
>
> This command is implemented in another patch later one, so this
> comment probably belong there.
>
> * This comment mentions a "direct access, without buffering":
>
> /*
> * Direct access to session variable (without buffering). Because
> * returned value can be used (without an assignement) after the
> * referenced session variables is updated, we have to use an copy
> * of stored value every time.
> */
> *op->resvalue = GetSessionVariableWithTypeCheck(op->d.vparam.varid,
>
> op->resnull,
>
> op->d.vparam.vartype);
>
> But GetSessionVariableWithTypeCheck goes through get_session_variable
> and searches in the hash table. What "buffering" means in this
> context?
>
> * GetSessionVariableWithTypeCheck(Oid varid, bool *isNull, Oid
> expected_typid)
>
> Should the "WithTypeCheck" part be an argument of the
> GetSessionVariable? To reduce the code duplication a bit.
>
> * Just out of curiosity, why TopTransactionContext?
>
> /*
> * Store domain_check extra in TopTransactionContext. When we are
> in
> * other transaction, the domain_check_extra cache is not valid
> * anymore.
> */
> if (svar->domain_check_extra_lxid != MyProc->lxid)
> svar->domain_check_extra = NULL;
>
> domain_check(svar->value, svar->isnull,
> svar->typid, &svar->domain_check_extra,
> TopTransactionContext);
>
> * In SVariableData it would be great to have more comments around
> freeval, domain_check_extra, domain_check_extra_lxid.
>
> * Nitpicking, but the term "shadowing" for ambiguity between a session
> variable and a table column might be confusing, one can imagine there
> is a connection between those two objects and one actively follows
> ("shadows") the other one.
>
> * The second patch "Storage for session variables and SQL interface"
> mentions in the documentation default and temporary variables:
>
> <para>
> The value of a session variable is local to the current session.
> Retrieving
> a variable's value returns either a <literal>NULL</literal> or a
> default
> value, unless its value has been set to something else in the current
> session using the <command>LET</command> command. The content of a
> variable
> is not transactional. This is the same as regular variables in PL
> languages.
> The session variables can be persistent or can be temporary. In both
> cases,
> the content of session variables is temporary and not shared (like an
> content of temporary tables).
> </para>
>
> They're implemented in the following patches, so it belongs there.
>
> * Nitpicking, maybe merge those two conditions together for readability?
>
> if (!needs_validation)
> return;
>
> /*
> * Reset, this flag here, before we start the validation. It can be
> set to
> * on by incomming sinval message.
> */
> needs_validation = false;
>
> if (!sessionvars)
> return;
>
> * This one is not very clear, what is the difference between "somewhere
> inside a transaction" and "at the end of a transaction"?
>
> /*
> * This routine can be called somewhere inside transaction or at an
> transaction
> * end. When atEOX argument is false, then we are inside
> transaction, and we
> * don't want to throw entries related to session variables dropped
> in current
> * transaction.
> */
>
> # Language topic
>
> Since this patch introduces a large body of documentation and
> commentaries, I think it would benefit from a native speaker review.
> I've stumbled upon few examples (attached with proposed wording, without
> a diff extension to not confuse the CF bot), but otherwise if anyone
> follows this thread, texts review is appreciated.
>
Thank you for your review. Next two weeks I'll not too much time to work
on this patch - I have to work on some commercial work, and the week is
Prague PgConf, so my reply will be slow. But after these events I'll
concentrate on this patch.
Regards
Pavel
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Thomas Munro | 2023-12-03 06:00:15 | Re: postgres_fdw test timeouts |
| Previous Message | Alexander Lakhin | 2023-12-03 05:00:00 | Re: postgres_fdw test timeouts |