From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
---|---|
To: | Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com> |
Cc: | Erik Rijkers <er(at)xs4all(dot)nl>, Michael Paquier <michael(at)paquier(dot)xyz>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, DUVAL REMI <REMI(dot)DUVAL(at)cheops(dot)fr>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: proposal: schema variables |
Date: | 2024-08-27 06:15:42 |
Message-ID: | 04ec666686e9e21cb515617df06885c66f3d34ce.camel@cybertec.at |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers pgsql-performance |
On Thu, 2024-08-15 at 07:55 +0200, Pavel Stehule wrote:
> út 30. 7. 2024 v 21:46 odesílatel Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> napsal:
> > - A general reminder: single line comments should start with a lower case
> > letter and have to period in the end:
>
> Should it be "have not to period in the end" ?
I made a typo. I mean "have *no* period in the end".
> I fixed almost all without parts related to psql and executor - there almost all
> current comments break the mentioned rule. So I use the style used in these files.
> I can fix it (if you like it) - or it can be fixed generally in a separated patch set.
Thanks. I also noticed that a lot of existing comments break that rule,
so I think that it is fine if you stick with the way that the surrounding
code does it.
> > - Variable as parameter:
> >
> > CREATE VARIABLE var AS date;
> > LET var = current_date;
> > PREPARE stmt(date) AS SELECT $1;
> > EXECUTE stmt(var);
> > ERROR: paramid of PARAM_VARIABLE param is out of range
> >
> > Is that working as intended? I don't understand the error message.
>
> fixed in 0002 patch (variables cannot be used as EXECUTE argument) and 0014 (enable usage variables as an argument of EXECUTE)
Thanks.
> > > --- a/src/backend/catalog/namespace.c
> > > +++ b/src/backend/catalog/namespace.c
> > > [...]
> > > +/*
> > > + * IdentifyVariable - try to find variable identified by list of names.
> > > [...]
> >
> > Perhaps part of the reason why this function is so complicated is that
> > you support things like this:
> >
> > CREATE SCHEMA sch;
> > CREATE TYPE cp AS (a integer, b integer);
> > CREATE VARIABLE sch.v AS cp;
> > LET sch.v = (1, 2);
> > SELECT sch.v.b;
> > b
> > ═══
> > 2
> > (1 row)
> >
> > test=# SELECT test.sch.v.b;
> > b
> > ═══
> > 2
> > (1 row)
> >
> > We don't support that for tables:
> >
> > CREATE TABLE tab (c cp);
> > INSERT INTO tab VALUES (ROW(42, 43));
> > SELECT tab.c.b FROM tab;
> > ERROR: cross-database references are not implemented: tab.c.b
> >
> > You have to use extra parentheses:
> >
> > SELECT (tab.c).b FROM tab;
> > b
> > ════
> > 43
> > (1 row)
> >
> > I'd say that this should be the same for session variables.
> > What do you think?
>
> I prefer the current state, but I don't have a very strong opinion about it.
> It can save 115 lines of almost trivial code, but I think the user experience
> can be much worse. Using composite types in tables is not too common a
> pattern (and when I read some recommendations for Oracle, it is an antipattern),
> but usage of composite variables is common (it can hold a row). Moreover,
> we talked long about possible identifier's collisions, and the pattern
> schema.variable is very good protection against possible collisions.
> Probably the pattern catalog.schema.variable.field is not too interesting
> but the support has 50 lines.
>
> More, the plpgsql supports pattern label.variable.field, so can be little bit
> unfriendly if session variables doesn't support "similar" pattern
I see your point, and I'm fine with leaving it as it is.
>
>
> > - src/backend/commands/session_variable.c
> >
> > There is one comment that confuses me and that I did not edit:
> >
> > > +Datum
> > > +GetSessionVariable(Oid varid, bool *isNull, Oid *typid)
> > > +{
> > > + SVariable svar;
> > > +
> > > + svar = get_session_variable(varid);
> > > +
> > > + /*
> > > + * Although svar is freshly validated in this point, the svar->is_valid can
> > > + * be false, due possible accepting invalidation message inside domain
> > > + * check. Now, the validation is done after lock, that can also accept
> > > + * invalidation message, so validation should be trustful.
> > > + *
> > > + * For now, we don't need to repeat validation. Only svar should be valid
> > > + * pointer.
> > > + */
>
> This comment is related to assertions. Before I had there `Assert(svar->is_valid)`,
> because I expected it. But it was not always true. And although it is true,
> we don't need to validate a variable, because at this moment, the variable
> should be locked, and then we can return content safely.
I guess my main problem is the word "trustful". I don't recognize that word.
Perhaps you can reword the comment along the lines of your above explanation.
>
> > - src/backend/executor/execMain.c
> >
> > > @@ -196,6 +198,51 @@ standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
> > > Assert(queryDesc->sourceText != NULL);
> > > estate->es_sourceText = queryDesc->sourceText;
> > >
> > > + /*
> > > + * The executor doesn't work with session variables directly. Values of
> > > + * related session variables are copied to dedicated array, and this array
> > > + * is passed to executor.
> > > + */
> >
> > Why? Is that a performance measure? The comment should explain that.
>
> Session variables are volatile internally - some function that uses LET statements
> can be called more times inside a query. This behavior is not consistent with
> plpgsql's variables or external parameters. And if we want to support parallel
> queries, then volatile session variables can be pretty messy (and unusable).
> If we want the same behaviour for queries with or without parallel support,
> then the session variables should be "stable" (like stable functions).
> Simple implementation is using "snapshot" of values of used session variables
> when query is started. This "snapshot" is immutable, so we don't need to make
> a copy more times.
>
> I changed this comment
Thanks.
> > - parallel safety
> >
> > > --- a/src/backend/optimizer/util/clauses.c
> > > +++ b/src/backend/optimizer/util/clauses.c
> > > @@ -935,6 +936,13 @@ max_parallel_hazard_walker(Node *node, max_parallel_hazard_context *context)
> > > if (param->paramkind == PARAM_EXTERN)
> > > return false;
> > >
> > > + /* We doesn't support passing session variables to workers */
> > > + if (param->paramkind == PARAM_VARIABLE)
> > > + {
> > > + if (max_parallel_hazard_test(PROPARALLEL_RESTRICTED, context))
> > > + return true;
> > > + }
> >
> > Even if a later patch alleviates that restriction, this patch should document that
> > session variables imply "parallel restricted". I have added that to doc/src/sgml/parallel.sgml.
> >
>
> merged (and removed in 0015)
Thanks.
I hope I can do some more review at some point in the future...
I sincerely hope that this patch set gets merged at some point.
The big obstacle is that it is so large. That's probably because it is pretty
feature-complete (but, as we have found, not totally free of bugs).
Judging from the amount of time I put into my review so far, I guess that this
patch set would keep a committer busy for several weeks. Perhaps the only way to
get this done would be to make you a committer...
Yours,
Laurenz Albe
From | Date | Subject | |
---|---|---|---|
Next Message | chungui.wcg | 2024-08-27 06:43:51 | Re:allowing extensions to control planner behavior |
Previous Message | Justin Clift | 2024-08-27 06:00:13 | Re: Significant Execution Time Difference Between PG13.14 and PG16.4 for Query on information_schema Tables. |
From | Date | Subject | |
---|---|---|---|
Next Message | Pavel Stehule | 2024-08-27 06:52:27 | Re: proposal: schema variables |
Previous Message | Scott Ribe | 2024-08-23 16:08:49 | Re: checking for a NULL date in a partitioned table kills performance |