| From: | Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com> |
|---|---|
| To: | Erik Wienhold <ewie(at)ewie(dot)name> |
| Cc: | Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Said Assemlal <sassemlal(at)neurorx(dot)com>, pgsql-hackers(at)postgresql(dot)org, Haibo Yan <haibo(dot)yan(at)hotmail(dot)com> |
| Subject: | Re: CREATE OR REPLACE MATERIALIZED VIEW |
| Date: | 2026-02-03 00:19:34 |
| Message-ID: | CA+renyW6-7aH74o-Gv2k+6emEXwznXxg=GkmTaKhDcresQQ9bg@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
Hello,
Yan Haibo and I reviewed this patch for the Patch Review Workshop.
The patch applies, compiles, and passes tests.
From v7-0001-Add-OR-REPLACE-option-to-CREATE-MATERIALIZED-VIEW.patch:
```
diff --git a/doc/src/sgml/ref/create_materialized_view.sgml
b/doc/src/sgml/ref/create_materialized_view.sgml
index 62d897931c3..5e03320eb73 100644
--- a/doc/src/sgml/ref/create_materialized_view.sgml
+++ b/doc/src/sgml/ref/create_materialized_view.sgml
@@ -67,7 +78,7 @@ CREATE MATERIALIZED VIEW [ IF NOT EXISTS ]
<replaceable>table_name</replaceable>
Do not throw an error if a materialized view with the same name already
exists. A notice is issued in this case. Note that there is no guarantee
that the existing materialized view is anything like the one that would
- have been created.
+ have been created, unless you use <literal>OR REPLACE</literal> instead.
</para>
</listitem>
</varlistentry>
```
"Unless" flows strangely here, at first connoting that it will work
with IF NOT EXISTS, but then "instead" retracts that suggestion.
Perhaps "unlike when using OR REPLACE".
```
static ObjectAddress
create_ctas_internal(List *attrList, IntoClause *into)
{
- CreateStmt *create = makeNode(CreateStmt);
- bool is_matview;
+ bool is_matview,
+ replace = false;
char relkind;
- Datum toast_options;
- const char *const validnsps[] = HEAP_RELOPT_NAMESPACES;
+ Oid matviewOid = InvalidOid;
ObjectAddress intoRelationAddr;
/* This code supports both CREATE TABLE AS and CREATE MATERIALIZED VIEW */
is_matview = (into->viewQuery != NULL);
relkind = is_matview ? RELKIND_MATVIEW : RELKIND_RELATION;
- /*
- * Create the target relation by faking up a CREATE TABLE parsetree and
- * passing it to DefineRelation.
- */
- create->relation = into->rel;
- create->tableElts = attrList;
- create->inhRelations = NIL;
- create->ofTypename = NULL;
- create->constraints = NIL;
- create->options = into->options;
- create->oncommit = into->onCommit;
- create->tablespacename = into->tableSpaceName;
- create->if_not_exists = false;
- create->accessMethod = into->accessMethod;
```
It seems like there is very little shared in this function between the
old code and new. Should replacing a matview just have a different
function altogether?
```
+ /* Check if an existing materialized view needs to be replaced. */
+ if (is_matview)
+ {
+ LOCKMODE lockmode;
- /*
- * Create the relation. (This will error out if there's an existing view,
- * so we don't need more code to complain if "replace" is false.)
- */
- intoRelationAddr = DefineRelation(create, relkind, InvalidOid, NULL, NULL);
+ lockmode = into->replace ? AccessExclusiveLock : NoLock;
+ (void) RangeVarGetAndCheckCreationNamespace(into->rel, lockmode,
+ &matviewOid);
+ replace = OidIsValid(matviewOid) && into->replace;
+ }
```
We didn't call RangeVarGetAndCheckCreationNamespace before, but now we
call it whether you say `OR REPLACE` or not. That seems suspicious.
Why do we need to call it even when you don't ask to replace an
existing matview? We only use matviewOid if we're replacing, and we
only lock the relation if we're replacing. So probably this should be
`if (is_matview && into->replace)` (which makes this function seem
even more strongly that it should be split into two).
```
+ rel = relation_open(matviewOid, NoLock);
```
The stanzas in this branch need comments. The code strongly resembles
DefineVirtualRelation, but without the comments. For instance in this
line, I wasn't sure why were opening the relation with no lock. This
comment from DefineVirtualRelation is very helpful:
```
/* Relation is already locked, but we must build a relcache entry. */`
```
The other lost comments would be helpful too.
```
+ /* add new attributes */
+ if (list_length(attrList) > rel->rd_att->natts)
+ {
+ ListCell *c;
+ int skip = rel->rd_att->natts;
+
+ foreach(c, attrList)
+ {
+ if (skip > 0)
+ {
+ skip--;
+ continue;
+ }
+ atcmd = makeNode(AlterTableCmd);
+ atcmd->subtype = AT_AddColumnToView;
+ atcmd->def = (Node *) lfirst(c);
+ atcmds = lappend(atcmds, atcmd);
+ }
+ }
```
Can we use list_nth_cell to start at the right position and avoid the
manual skipping? (So the loop would be for instead of foreach.) Or
even better, for_each_from looks perfect for this.
```
+ /* access method */
+ atcmd = makeNode(AlterTableCmd);
+ atcmd->subtype = AT_SetAccessMethod;
+ atcmd->name = into->accessMethod ? into->accessMethod :
default_table_access_method;
+ atcmds = lappend(atcmds, atcmd);
```
Just to confirm, this is a noop if the access method is already the
same? And likewise with the other AT subcommands?
```
@@ -234,7 +342,26 @@ ExecCreateTableAs(ParseState *pstate,
CreateTableAsStmt *stmt,
/* Check if the relation exists or not */
if (CreateTableAsRelExists(stmt))
+ {
+ /* An existing materialized view can be replaced. */
+ if (is_matview && into->replace)
+ {
+ RefreshMatViewStmt *refresh;
+
+ /* Change the relation to match the new query and other options. */
+ (void) create_ctas_nodata(query->targetList, into);
+
+ /* Refresh the materialized view with a fake statement. */
+ refresh = makeNode(RefreshMatViewStmt);
+ refresh->relation = into->rel;
+ refresh->skipData = into->skipData;
+ refresh->concurrent = false;
+
+ return ExecRefreshMatView(refresh, pstate->p_sourcetext, qc);
+ }
+
return InvalidObjectAddress;
+ }
```
Maybe an `else` for the other case? Having parallel indentation
communicates the parallel execution.
```
@@ -402,14 +529,15 @@ CreateTableAsRelExists(CreateTableAsStmt *ctas)
oldrelid = get_relname_relid(into->rel->relname, nspid);
if (OidIsValid(oldrelid))
{
- if (!ctas->if_not_exists)
+ if (!ctas->if_not_exists && !into->replace)
```
Changing the contract without changing the function comment seems wrong.
But is this really factored correctly? Maybe the check should happen
in the caller instead.
Does this require a change to ExplainOneUtility, which also calls this
function? At least we probably need to handle OR REPLACE there.
```
if (newdesc->natts < olddesc->natts)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
- errmsg("cannot drop columns from view")));
+ {
+ if (is_matview)
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot drop columns from materialized view"));
+ else
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
+ errmsg("cannot drop columns from view")));
+ }
```
Instead of repeating so much, perhaps `errmsg("cannot drop columns
from %s", is_matview ? "materialized view" : "view")`? Or since that
is probably bad for translators, at least this:
```
errmsg(is_matview
? "cannot drop columns from materialized view"
: "cannot drop columns from view")
```
Likewise with more error messages in this function. (There are a lot of them.)
The parser changes seem fine (see below re the third patch though).
Tab completion seems fine.
From v7-0002-Handle-default-tablespace-in-AlterTableInternal.patch:
These changes should get rolled into the previous patch. I think you
broke them out to allow considering them separately, although your
patches would be simpler if they were the first patch in the series
then.
```
+ /* use empty string to specify default tablespace */
+ atcmd->name = into->tableSpaceName ? into->tableSpaceName : "";
atcmds = lappend(atcmds, atcmd);
```
We didn't love using "" as a magic string to signal the
AT_SetTableSpace subcommand. At least it should be a named constant,
but maybe a separate struct member is better.
```
- /* Check that the tablespace exists */
- tablespaceId = get_tablespace_oid(tablespacename, false);
+ if (tablespacename != NULL && tablespacename[0] == '\0')
+ {
+ /* Use default tablespace if name is empty string */
+ tablespaceId =
GetDefaultTablespace(rel->rd_rel->relpersistence,
rel->rd_rel->relispartition);
+ if (!OidIsValid(tablespaceId))
+ tablespaceId = MyDatabaseTableSpace;
+ }
+ else
+ {
+ /* Check that the tablespace exists */
+ tablespaceId = get_tablespace_oid(tablespacename, false);
+ }
/* Check permissions except when moving to database's default */
if (OidIsValid(tablespaceId) && tablespaceId != MyDatabaseTableSpace)
```
This seems like something that should happen during analysis, but I
know we are less strict for utility statements.
Are there any locking issues with calling GetDefaultTablespace here
instead of earlier? Why does ATPrepSetTableSpace take a lockmode but
not use it? Can the table space get dropped in the middle of the
command? But this is probably handled already. Do we need the lock the
*old* tablespace?
What about partitions?
From v7-0003-Add-WITH-OLD-DATA-to-CREATE-OR-REPLACE-MATERIALIZ.patch:
```
@@ -383,6 +391,9 @@ ExecCreateTableAs(ParseState *pstate,
CreateTableAsStmt *stmt,
*/
if (is_matview)
{
+ if (into->keepData)
+ elog(ERROR, "must not specify WITH OLD DATA when creating
a new materialized view");
+
do_refresh = !into->skipData;
into->skipData = true;
}
```
Should this be ereport? I guess the current parser productions should
prevent us from reaching this code though. So perhaps a comment?
```
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 7aaf0e37ad8..e51ce2701be 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -4944,6 +4944,22 @@ CreateMatViewStmt:
$7->replace = true;
$$ = (Node *) ctas;
}
+ | CREATE OR REPLACE OptNoLog MATERIALIZED VIEW
create_mv_target AS SelectStmt WITH OLD DATA_P
+ {
+ CreateTableAsStmt *ctas = makeNode(CreateTableAsStmt);
+
+ ctas->query = $9;
+ ctas->into = $7;
+ ctas->objtype = OBJECT_MATVIEW;
+ ctas->is_select_into = false;
+ ctas->if_not_exists = false;
+ /* cram additional flags into the IntoClause */
+ $7->rel->relpersistence = $4;
+ $7->skipData = false;
+ $7->keepData = true;
+ $7->replace = true;
+ $$ = (Node *) ctas;
+ }
;
```
This is the fourth production for minor variations of
CreateMatViewStmt, which seems like a lot. Perhaps instead of
opt_with_data we add a new production opt_with_no_or_old_data that can
return 3 alteratives (WITH [{NO|OLD}] DATA)?
```
+-- replace query but keep old data
+CREATE OR REPLACE MATERIALIZED VIEW mvtest_replace AS
+ SELECT 5 AS a
+ WITH OLD DATA;
+SELECT * FROM mvtest_replace;
+REFRESH MATERIALIZED VIEW mvtest_replace;
+SELECT * FROM mvtest_replace;
+
```
We would like to see a test adding a new column while using WITH OLD
DATA. It is safe because every heap tuple knows how many attributes it
holds (and I tested it), but I did wonder if anyone had been tempted
to call fastgetattr directly for matviews. (Before your feature, it
would have been safe, I think.) Here is the test I did:
```
create materialized view mvnulls as
select 1 c1, 2 c2, 3 c3, 4 c4, 5 c5, 6 c6, 7 c7, null c8;
-- add a ninth column (exceeding the old t_bits):
create or replace materialized view mvnulls as
select 1 c1, 2 c2, 3 c3, 4 c4, 5 c5, 6 c6, 7 c7, null c8, null c9
with old data;
select c9 from mvnulls;
```
Another thing to think about and test: what if while using WITH OLD
DATA the new matview definition violates a unique index? Or a domain?
Are there any other constraints that can be applied to matviews to
consider?
Yours,
--
Paul ~{:-)
pj(at)illuminatedcomputing(dot)com
| From | Date | Subject | |
|---|---|---|---|
| Previous Message | Tatsuo Ishii | 2026-02-03 00:00:39 | Re: Row pattern recognition |