| From: | "Alberto Piai" <alberto(dot)piai(at)gmail(dot)com> |
|---|---|
| To: | "Alberto Piai" <alberto(dot)piai(at)gmail(dot)com>, <pgsql-hackers(at)postgresql(dot)org> |
| Cc: | Álvaro Herrera <alvherre(at)kurilemu(dot)de>, "Laurenz Albe" <laurenz(dot)albe(at)cybertec(dot)at> |
| Subject: | Re: Adding a stored generated column without long-lived locks |
| Date: | 2026-05-27 17:43:53 |
| Message-ID: | DITN90YX3328.2A0I3LCV164BA@gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hi,
On Fri May 15, 2026 at 12:46 AM CEST, Alberto Piai wrote:
> On Fri Apr 24, 2026 at 2:10 AM PDT, Alberto Piai wrote:
>> On Tue Apr 7, 2026 at 5:02 PM +08, Alberto Piai wrote:
>>> On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
>>>
>>>> I recently needed to add a stored generated column to a table of
>>>> nontrivial size, and realized that currently there is no way to do
>>>> that without rewriting the table under an AccessExclusiveLock.
here's a not-so-brief summary of the conversations around this topic at
pgconf.dev, and a new proposal at the end.
I had the chance to bring this up with other attendees, and many
recognized the use case as a useful one, addressing a real operational
issue.
In particular, I had great feedback from Staš Kotarac Guček, who pointed
out a major flaw in my current proposal: a constraint of the form
CHECK (c = expr)
would not work correctly when expr evaluates to null for some rows.
Thank you Staš, in the next iteration I will change the constraint to
use IS NOT DISTINCT FROM, instead.
I briefly mentioned this topic to Tom Lane, who quickly replied with the
question: should this not fail when it can't use the constraint, instead
of overwriting the contents of the column?
Thanks Tom, I will get to this later in this mail.
I had registered this patch for the in-person commitfest at pgconf.dev,
and Álvaro Herrera picked it up for review. Thank you Álvaro, and thank
you Peter for organizing the event.
We managed to find some time on the very last day of the conference, and
went through the current design and code. The open items (which I will
address in the next iteration of this patch) are:
* missing user documentation
I will work on this next. I think it's a good way to explain the
feature even early during development. I just didn't want to do it
_too_ early, without having had any feedback.
* try to minimize command counter increments
There might be one call to CommandCounterIncrement() which is not
necessary, I'll try remove it.
* comment on why it is necessary to clear missing values when rewriting
the table
ATExecAlterColumnType() and ATExecSetExpression() both do this
explicitly when requesting a table rewrite. I'll extend the comment,
and also look into whether this is something that should be done any
time a table rewrite happens. In that case, it might be worth moving
this into the rewriting code rather than having each caller do it.
* interactions with other subcommands in the same alter table statement
My reasoning regarding this was: if I do this in
AT_PASS_SET_EXPRESSION, it should be safe. I will invest some more
time into this and add tests, too.
We also looked at the overall design of the new command, and we agreed
that it is a fitting addition to our current SET EXPRESSION and DROP
EXPRESSION. Regarding the question of whether it should be SET or ADD,
we agreed that ADD (i.e. the current proposal) is clearer, especially
for its similarity to ADD GENERATED { ALWAYS | BY DEFAULT } AS IDENTITY.
Regarding the question of "should this fail or rewrite the table when a
usable constraint isn't found": Álvaro's suggestion here was to use a
more ad-hoc command, meant more specifically for this use case of
converting into a stored generated column without rewriting it. If the
command would be dedicated specifically to this, it would make sense to
have it fail when a usable constraint isn't found.
Last but not least, I also discussed this with Laurenz Albe, and he
wrote a very useful review in this thread. I will address that
separately and reply directly to that mail, but one point I can already
merge in this discussion is about the syntax of the command:
> 1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
> and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
> for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
> So this patch adds non-standard syntax that may one day conflict with
> a new version of the standard. I think we can still do it, and the
> proposed syntax looks right, but I thought I should mention it.
I'd like to take his point, together with the question from Tom and the
suggestion by Álvaro, and make a new proposal for the design of this
command.
Design iteration 2
------------------
Syntax:
ALTER TABLE t ALTER COLUMN c
ADD GENERATED ALWAYS AS (expr) STORED USING CONSTRAINT check_name
check_name must be a valid constraint of the form
CHECK (c IS NOT DISTINCT FROM (expr))
This fails if:
- a check constraint named check_name is not found for table c
- the constraint is not valid
- the constraint does not match exactly the expr the user intends to use
as a stored default expression
On success, the table c is now a stored generated column with the given
default expression, and the check_name constraint has been removed.
This addresses Tom's remark, we can now fail instead of just rewriting
the column.
It improves slightly upon the issue of a potential conflict with a
future edition of the SQL standard, by being more specific. I don't see
a way to be completely sure we won't have conflicts. We could improve
more by making the syntax more "alien" and very unlinkely to be picked
up by the standard, but at a usability cost for Postgres. I'm open to
suggestions.
It improves upon another question raised by Álvaro: does the user have
to clean up the constraint? In v1 I felt it was better to have the user
remove it after the migration. Since here it's explicitly mentioned as
the constraint to use to migrate the column, I think it's OK to remove
it. We are conceptually moving it from being a constraint to being the
new default expression.
The implementation should also be simpler, since there will never be a
table rewrite.
Any thoughts about this?
Best regards,
Alberto
--
Alberto Piai
Sensational AG
Zürich, Switzerland
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Nathan Bossart | 2026-05-27 17:43:58 | Re: remove pg_spin_delay() from atomics code |
| Previous Message | lin teletele | 2026-05-27 17:08:46 | Re: Use pg_current_xact_id() instead of deprecated txid_current() |