Re: Schema variables - new implementation for Postgres 15

From: Pavel Stehule <pavel(dot)stehule(at)gmail(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Dean Rasheed <dean(dot)a(dot)rasheed(at)gmail(dot)com>, Joel Jacobson <joel(at)compiler(dot)org>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Schema variables - new implementation for Postgres 15
Date: 2022-01-28 06:51:08
Message-ID: CAFj8pRCZ9fVAC6h-s8fHS9muXmPwvHcwAX3omRJm9vUaQTMUcA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

st 26. 1. 2022 v 8:23 odesílatel Julien Rouhaud <rjuju123(at)gmail(dot)com> napsal:

> Hi,
>
> On Tue, Jan 25, 2022 at 10:53:00PM +0100, Pavel Stehule wrote:
> >
> > út 25. 1. 2022 v 6:18 odesílatel Julien Rouhaud <rjuju123(at)gmail(dot)com>
> napsal:
> > >
> > > First, I don't think that acquiring the lock in
> > > get_session_variable_type_typmod_collid() and
> > > prepare_variable_for_reading() is
> > > the correct approach. In transformColumnRef() and transformLetStmt()
> you
> > > first
> > > call IdentifyVariable() to check if the given name is a variable
> without
> > > locking it and later try to lock the variable if you get a valid Oid.
> > > This is
> > > bug prone as any other backend could drop the variable between the two
> > > calls
> > > and you would end up with a cache lookup failure. I think the lock
> should
> > > be
> > > acquired during IdentifyVariable. It should probably be optional as
> one
> > > codepath only needs the information to raise a warning when a variable
> is
> > > shadowed, so a concurrent drop isn't a problem there.
> > >
> >
> > I moved lock to IdentifyVariable routine
>
> +IdentifyVariable(List *names, char **attrname, bool lockit, bool
> *not_unique)
> +{
> [...]
> + return varoid_without_attr;
> + }
> + else
> + {
> + *attrname = c;
> + return varoid_with_attr;
> [...]
> +
> + if (OidIsValid(varid) && lockit)
> + LockDatabaseObject(VariableRelationId, varid, 0, AccessShareLock);
> +
> + return varid;
>
> There are still some code paths that may not lock the target variable when
> required.
>

fixed

>
> Also, the function comment doesn't say much about attrname handling, it
> should
> be clarifed. I think it should initially be set to NULL, to make sure that
> it's always a valid pointer after the function returns.
>

done

>
>
> > attached updated patch
>

> Various comments on the patch:
>
> No test for GRANT/REVOKE ... ALL VARIABLES IN SCHEMA, maybe it would be
> good to
> have one?
>

done

>
> Documentation:
>
> catalogs.sgml:
>
> You're still using the old-style 4 columns table, it should be a single
> column
> like the rest of the file.
>

done

>
> + <para>
> + The <command>CREATE VARIABLE</command> command creates a session
> variable.
> + Session variables, like relations, exist within a schema and their
> access is
> + controlled via <command>GRANT</command> and <command>REVOKE</command>
> + commands. Changing a session variable is non-transactional.
> + </para>
>
> The "changing a session variable is non-transactional" is ambiguous. I
> think
> that only the value part isn't transactional, the variable metadata
> themselves
> (ALTER VARIABLE and other DDL) are transactional right? This should be
> explicitly described here (although it's made less ambiguous in the next
> paragraph).
>

sure, DDL of session variables are transactional. I removed this sentence.

> + <para>
> + Session variables are retrieved by the <command>SELECT</command> SQL
> + command. Their value is set with the <command>LET</command> SQL
> command.
> + While session variables share properties with tables, their value
> cannot be
> + updated with an <command>UPDATE</command> command.
> + </para>
>
> should this part mention that session variables can be shadowed? For now
> the
> only mention to that is in advanced.sgml.
>

good idea, I wrote note about it there

>
> + The <literal>DEFAULT</literal> clause can be used to assign a
> default
> + value to a session variable.
>
> The expression is lazily evaluated during the session first use of the
> variable. This should be documented as any usage of volatile expression
> will
> be impacted.
>

done

>
> + The <literal>ON TRANSACTION END RESET</literal>
> + clause causes the session variable to be reset to its default value
> when
> + the transaction is committed or rolled back.
>
> As far as I can see this clauses doesn't play well with IMMUTABLE
> VARIABLE, as
> you can reassign a value once the transaction ends. Same for DISCARD [
> ALL |
> VARIABLES ], or LET var = NULL (or DEFAULT if no default value). Is that
> intended?
>

I think so it is expected. The life scope of assigned (immutable) value is
limited to transaction (when ON TRANSACTION END RESET).
DISCARD is used for reset of session, and after it, you can write the value
first time.

I enhanced doc in IMMUTABLE clause

> + <literal>LET</literal> extends the syntax defined in the SQL
> + standard. The <literal>SET</literal> command from the SQL standard
> + is used for different purposes in
> <productname>PostgreSQL</productname>.
>
> I don't fully understand that. Are (session) variables defined in the SQL
> standard? If yes, all the other documentation pages should clarify that as
> they currently say that this is a postgres extension. If not, this part
> should
> made it clear what is defined in the standard.
>

I reread standard more carefully, and it looks so SQL/PSM doesn't define
global variables ever. The modules defined by SQL/PSM can holds only
temporal tables or routines. Unfortunately, this part of standard is almost
dead, and there is not referential implementation. The most near to
standard in this area is DB2, but global session variables are proprietary
feature. The usage is very similar to our session variables with one
significant difference - the global session variables can be modified by
commands SELECT INTO, VALUES INTO, EXECUTE INTO and SET (Our session
variables can be modified just by LET command.). I am sure, so if SQL/PSM
supports global session variables, then it uses SET statement - like DB2,
but I didn't find any note about support in standard.

I think so the best comment to compatibility is just

<para>
The <command>LET</command> is a <productname>PostgreSQL</productname>
extension.
</para>

>
> In revoke.sgml:
> + REVOKE [ GRANT OPTION FOR ]
> + { { READ | WRITE } [, ...] | ALL [ PRIVILEGES ] }
> + ON VARIABLE <replaceable>variable_name</replaceable> [, ...]
> + FROM { [ GROUP ] <replaceable
> class="parameter">role_name</replaceable> | PUBLIC } [, ...]
> + [ CASCADE | RESTRICT ]
>
> there's no extra documentation for that, and therefore no clarification on
> variable_name.
>

This is same like function_name, domain_name, ...

>
> VariableIsVisible():
> + * If it is in the path, it might still not be visible; it
> could be
> + * hidden by another relation of the same name earlier in
> the path. So
> + * we must do a slow check for conflicting relations.
>
> should it be "another variable of the same name"?
>
>
yes, fixed

>
> Tab completion: CREATE IMMUTABLE VARIABLE is not handled
>

fixed

>
>
> pg_variable.c:
> Do we really need both session_variable_get_name() and
> get_session_variable_name()?
>

They are different - first returns possibly qualified name, second returns
only name. Currently it is used just for error messages in
transformAssignmentIndirection, and I think so it is good for consistency
with other usage of this routine (transformAssignmentIndirection).

>
> +/*
> + * Fetch all fields of session variable from the syscache.
> + */
> +void
> +initVariable(Variable *var, Oid varid, bool missing_ok, bool fast_only)
>
> As least fast_only should be documented in the function comment, especially
> regarding var->varname, since:
>
> + var->oid = varid;
> + var->name = pstrdup(NameStr(varform->varname));
> [...]
> + if (!fast_only)
> + {
> + Datum aclDatum;
> + bool isnull;
> +
> + /* name */
> + var->name = pstrdup(NameStr(varform->varname));A
> [...]
> + else
> + {
> + var->name = NULL;
>
> is the output value guaranteed or not? In any case it shouldn't be set
> twice.
>

It was messed, fixed

>
> Also, I don't see any caller for missing_ok == true, should we remove it?
>

removed

>
> +/*
> + * Create entry in pg_variable table
> + */
> +ObjectAddress
> +VariableCreate(const char *varName,
> [...]
> + /* dependency on any roles mentioned in ACL */
> + if (varacl != NULL)
> + {
> + int nnewmembers;
> + Oid *newmembers;
> +
> + nnewmembers = aclmembers(varacl, &newmembers);
> + updateAclDependencies(VariableRelationId, varid, 0,
> + varOwner,
> + 0, NULL,
> + nnewmembers, newmembers);
>
> Shouldn't you use recordDependencyOnNewAcl() instead? Also, sn't it
> missing a
> recordDependencyOnOwner()?
>

changed and fixed

>
> sessionvariable.c:
>
> + * Although session variables are not transactional, we don't
> + * want (and we cannot) to run cleaning immediately (when we
> + * got sinval message). The value of session variables can
> + * be still used or the operation that emits cleaning can be
> + * reverted. Unfortunatelly, this check can be done only in
> + * when transaction is committed (the check against system
> + * catalog requires transaction state).
>
> This was the original idea, but since there's now locking to make all DDL
> safe,
> the metadata should be considered fully transactional and no session should
> still be able to use a concurrently dropped variable. Also, the
> invalidation
> messages are not sent until the transaction is committed. So is that
> approach
> still needed (at least for things outside ON COMMIT DROP / ON TRANSACTION
> END
> RESET)?
>

I enhanced comment

>
> I'm also attaching a 3rd patch with some proposition for documentation
> rewording (including consistent use of *session* variable), a few comments
> rewording, copyright year bump and minor things like that.
>

Thank you very much for it. This patch is based on your changes.

Regards

Pavel

>
> Note that I still didn't really review pg_variable.c or sessionvariable.c
> since
> there might be significant changes there for either the sinval / immutable
> part
> I mentioned.
>

Attachment Content-Type Size
v20220128-0002-column-doesn-t-exists-message.patch text/x-patch 25.2 KB
v20220128-0001-session-variables.patch text/x-patch 322.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Bharath Rupireddy 2022-01-28 06:54:48 Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Previous Message Dilip Kumar 2022-01-28 06:46:33 Re: [BUG]Update Toast data failure in logical replication