| From: | Laurenz Albe <laurenz(dot)albe(at)cybertec(dot)at> |
|---|---|
| To: | Alberto Piai <alberto(dot)piai(at)gmail(dot)com>, pgsql-hackers(at)postgresql(dot)org |
| Subject: | Re: Adding a stored generated column without long-lived locks |
| Date: | 2026-05-26 15:23:31 |
| Message-ID: | 1514ac90bc374f750df34fe465caadd77121b067.camel@cybertec.at |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Thu, 2026-05-14 at 15:46 -0700, Alberto Piai wrote:
> > > On Tue Mar 17, 2026 at 5:31 PM +07, Alberto Piai wrote:
> > >
> > > > I recently needed to add a stored generated column to a table of
> > > > nontrivial size, and realized that currently there is no way to do
> > > > that without rewriting the table under an AccessExclusiveLock.
>
> The attached v4 is a rebase against current master
I understand the need that the patch fulfills, and I agree that it would be a
nice feature.
I have a few thoughts about this that don't concern the implementation:
1) The SQL standard knows ALTER TABLE ... ADD ... GENERATED ALWAYS AS (...)
and ALTER TABLE ... ALTER ... DROP EXPRESSION, but there is no provision
for ALTER TABLE ... ADD GENERATED ALWAYS AS (...).
So this patch adds non-standard syntax that may one day conflict with
a new version of the standard. I think we can still do it, and the
proposed syntax looks right, but I thought I should mention it.
2) We currently have ALTER TABLE ... ALTER ... SET EXPRESSION AS (...) to
change the generation expression of a column. This command always
rewrites the table, according to the documentation.
I think that if the present patch adds support to skip rewriting the table
when a generation expression is added and the expression matches a check
constraint, changing the generation expression should also be possible
without a rewrite. If not, I would consider that a violation of the
principle of least astonishment.
Would it be difficult to extend the patch to support that?
3) We already have a couple of tricks to avoid blocking for a long time:
- ALTER TABLE ... ALTER ... SET NOT NULL can skip the table scan if there
is a check constraint that makes sure that the column is NOT NULL
- ALTER TABLE ... ATTACH PARTITION can skip the scan of the new partition
if there is a check constraint matching the partition constraint
It would be great to document these little tricks in the documentation,
probably on the ALTER TABLE page. This is not necessarily the job of
this patch, but it would also not be off-topic for the patch.
Comments on the patch:
----------------------
The patch applies and builds cleanly and passes the regression tests.
Missing parts:
- There is no documentation. At least ALTER TABLE needs a description of the
new syntax, and would ideally mention the trick with the check constraint.
- There should be support for command line completion for the new syntax.
Bugs:
- The patch doesn't test if the column is an identity column:
CREATE TABLE tab (id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY);
INSERT INTO tab VALUES (DEFAULT);
ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
The ALTER TABLE should fail, but doesn't.
- Strange behavior with sequences owned by the column:
CREATE TABLE tab (id bigserial);
INSERT INTO tab VALUES (DEFAULT);
ALTER TABLE tab ALTER id ADD GENERATED ALWAYS AS (1) STORED;
\ds tab_id_seq
List of sequences
Schema | Name | Type | Owner
--------+------------+----------+----------
public | tab_id_seq | sequence | postgres
(1 row)
I think that any sequence owned by the column should be dropped.
Alternatively, you could throw an error.
- Incorrect handling of NULL values:
CREATE TABLE tab (col1 integer, col2 integer);
INSERT INTO tab VALUES (2, NULL);
-- works, because NULL results from the check are accepted
ALTER TABLE tab ADD CHECK (col2 = col1);
SELECT pg_relation_filenode('tab');
pg_relation_filenode
----------------------
19920
(1 row)
ALTER TABLE tab ALTER col2 ADD GENERATED ALWAYS AS (col1) STORED;
SELECT pg_relation_filenode('tab');
pg_relation_filenode
----------------------
19920
(1 row)
TABLE tab;
col1 | col2
------+------
2 | ∅
(1 row)
I am not sure what the correct approach would be. The simple approach would be
to only skip the rewrite if the column has a NOT NULL constraint or an equivalent
check constraint, but perhaps you can think of a way to do better.
Comments on the code:
> --- a/src/backend/commands/tablecmds.c
> +++ b/src/backend/commands/tablecmds.c
> @@ -5093,6 +5102,13 @@ ATPrepCmd(List **wqueue, Relation rel, AlterTableCmd *cmd,
> ATSimpleRecursion(wqueue, rel, cmd, recurse, lockmode, context);
> pass = AT_PASS_SET_EXPRESSION;
> break;
> + case AT_AddGeneratedAsExprStored:
You should add a comment, same as for the other branches.
> @@ -6695,6 +6717,8 @@ alter_table_type_to_string(AlterTableType cmdtype)
> return "ALTER COLUMN ... SET NOT NULL";
> case AT_SetExpression:
> return "ALTER COLUMN ... SET EXPRESSION";
> + case AT_AddGeneratedAsExprStored:
> + return "ALTER COLUMN ... ADD GENERATED ALWAYS AS (...) STORED";
Keep it short, like "ALTER COLUMN ... ADD GENERATED".
> --- a/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
> +++ b/src/test/modules/test_ddl_deparse/test_ddl_deparse.c
> @@ -129,6 +129,9 @@ get_altertable_subcmdinfo(PG_FUNCTION_ARGS)
> case AT_SetNotNull:
> strtype = "SET NOT NULL";
> break;
> + case AT_AddGeneratedAsExprStored:
> + strtype = "ADD GENERATED ALWAYS AS (...) STORED";
> + break;
I suggest "ALTER COLUMN ADD GENERATED ALWAYS AS", but I won't insist.
Yours,
Laurenz Albe
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Alvaro Herrera | 2026-05-26 15:26:20 | Re: [PATCH] Fix REPACK decoding worker not cleaned up on FATAL exit |
| Previous Message | Fujii Masao | 2026-05-26 15:01:38 | Re: Deadlock detector fails to activate on a hot standby replica |