Re: CREATE OR REPLACE MATERIALIZED VIEW

From: Erik Wienhold <ewie(at)ewie(dot)name>
To: Paul A Jungwirth <pj(at)illuminatedcomputing(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Said Assemlal <sassemlal(at)neurorx(dot)com>, pgsql-hackers(at)postgresql(dot)org, Haibo Yan <haibo(dot)yan(at)hotmail(dot)com>
Subject: Re: CREATE OR REPLACE MATERIALIZED VIEW
Date: 2026-06-28 18:14:11
Message-ID: 6d710ac3-3a66-4b31-b35a-0b4d9afdda42@ewie.name
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


On 2026-02-03 01:19 +0100, Paul A Jungwirth wrote:
> Yan Haibo and I reviewed this patch for the Patch Review Workshop.

Thank you both for your review. Sorry about my late reply but it took
me some time to address your points with the attached v8 and I only had
so little time available.

> The patch applies, compiles, and passes tests.
>
> From v7-0001-Add-OR-REPLACE-option-to-CREATE-MATERIALIZED-VIEW.patch:
>
> ```
> diff --git a/doc/src/sgml/ref/create_materialized_view.sgml
> b/doc/src/sgml/ref/create_materialized_view.sgml
> index 62d897931c3..5e03320eb73 100644
> --- a/doc/src/sgml/ref/create_materialized_view.sgml
> +++ b/doc/src/sgml/ref/create_materialized_view.sgml
> @@ -67,7 +78,7 @@ CREATE MATERIALIZED VIEW [ IF NOT EXISTS ]
> <replaceable>table_name</replaceable>
> Do not throw an error if a materialized view with the same name already
> exists. A notice is issued in this case. Note that there is no guarantee
> that the existing materialized view is anything like the one that would
> - have been created.
> + have been created, unless you use <literal>OR REPLACE</literal> instead.
> </para>
> </listitem>
> </varlistentry>
> ```
>
> "Unless" flows strangely here, at first connoting that it will work
> with IF NOT EXISTS, but then "instead" retracts that suggestion.
> Perhaps "unlike when using OR REPLACE".

Done.

> ```
> static ObjectAddress
> create_ctas_internal(List *attrList, IntoClause *into)
> {
> - CreateStmt *create = makeNode(CreateStmt);
> - bool is_matview;
> + bool is_matview,
> + replace = false;
> char relkind;
> - Datum toast_options;
> - const char *const validnsps[] = HEAP_RELOPT_NAMESPACES;
> + Oid matviewOid = InvalidOid;
> ObjectAddress intoRelationAddr;
>
> /* This code supports both CREATE TABLE AS and CREATE MATERIALIZED VIEW */
> is_matview = (into->viewQuery != NULL);
> relkind = is_matview ? RELKIND_MATVIEW : RELKIND_RELATION;
>
> - /*
> - * Create the target relation by faking up a CREATE TABLE parsetree and
> - * passing it to DefineRelation.
> - */
> - create->relation = into->rel;
> - create->tableElts = attrList;
> - create->inhRelations = NIL;
> - create->ofTypename = NULL;
> - create->constraints = NIL;
> - create->options = into->options;
> - create->oncommit = into->onCommit;
> - create->tablespacename = into->tableSpaceName;
> - create->if_not_exists = false;
> - create->accessMethod = into->accessMethod;
> ```
>
> It seems like there is very little shared in this function between the
> old code and new. Should replacing a matview just have a different
> function altogether?

I factored out the replace logic into create_ctas_replace leaving the
existing create_ctas_internal now untouched. create_ctas_replace is
called from create_ctas_nodata where we check whether we're replacing an
existing matview.

> ```
> + /* Check if an existing materialized view needs to be replaced. */
> + if (is_matview)
> + {
> + LOCKMODE lockmode;
>
> - /*
> - * Create the relation. (This will error out if there's an existing view,
> - * so we don't need more code to complain if "replace" is false.)
> - */
> - intoRelationAddr = DefineRelation(create, relkind, InvalidOid, NULL, NULL);
> + lockmode = into->replace ? AccessExclusiveLock : NoLock;
> + (void) RangeVarGetAndCheckCreationNamespace(into->rel, lockmode,
> + &matviewOid);
> + replace = OidIsValid(matviewOid) && into->replace;
> + }
> ```
>
> We didn't call RangeVarGetAndCheckCreationNamespace before, but now we
> call it whether you say `OR REPLACE` or not. That seems suspicious.
> Why do we need to call it even when you don't ask to replace an
> existing matview? We only use matviewOid if we're replacing, and we
> only lock the relation if we're replacing. So probably this should be
> `if (is_matview && into->replace)` (which makes this function seem
> even more strongly that it should be split into two).

RangeVarGetAndCheckCreationNamespace is now called in create_ctas_nodata
when is_matview && into->replace is true.

> ```
> + rel = relation_open(matviewOid, NoLock);
> ```
>
> The stanzas in this branch need comments. The code strongly resembles
> DefineVirtualRelation, but without the comments. For instance in this
> line, I wasn't sure why were opening the relation with no lock. This
> comment from DefineVirtualRelation is very helpful:
>
> ```
> /* Relation is already locked, but we must build a relcache entry. */`
> ```
>
> The other lost comments would be helpful too.

I added these comments now.

> ```
> + /* add new attributes */
> + if (list_length(attrList) > rel->rd_att->natts)
> + {
> + ListCell *c;
> + int skip = rel->rd_att->natts;
> +
> + foreach(c, attrList)
> + {
> + if (skip > 0)
> + {
> + skip--;
> + continue;
> + }
> + atcmd = makeNode(AlterTableCmd);
> + atcmd->subtype = AT_AddColumnToView;
> + atcmd->def = (Node *) lfirst(c);
> + atcmds = lappend(atcmds, atcmd);
> + }
> + }
> ```
>
> Can we use list_nth_cell to start at the right position and avoid the
> manual skipping? (So the loop would be for instead of foreach.) Or
> even better, for_each_from looks perfect for this.

Now simplified with for_each_from.

> ```
> + /* access method */
> + atcmd = makeNode(AlterTableCmd);
> + atcmd->subtype = AT_SetAccessMethod;
> + atcmd->name = into->accessMethod ? into->accessMethod :
> default_table_access_method;
> + atcmds = lappend(atcmds, atcmd);
> ```
>
> Just to confirm, this is a noop if the access method is already the
> same? And likewise with the other AT subcommands?

It's a noop for access method and tablespace, but not for storage
options.

AT_SetAccessMethod: Checked by ATPrepSetAccessMethod to skip phase 3

AT_SetTableSpace: Checked by ATExecSetTableSpace using
CheckRelationTableSpaceMove in phase 3

AT_ReplaceRelOptions: Always replaces the current storage options
(ATExecSetRelOptions with transformRelOptions) in phase 2 regardless of
whether there are changes.

> ```
> @@ -234,7 +342,26 @@ ExecCreateTableAs(ParseState *pstate,
> CreateTableAsStmt *stmt,
>
> /* Check if the relation exists or not */
> if (CreateTableAsRelExists(stmt))
> + {
> + /* An existing materialized view can be replaced. */
> + if (is_matview && into->replace)
> + {
> + RefreshMatViewStmt *refresh;
> +
> + /* Change the relation to match the new query and other options. */
> + (void) create_ctas_nodata(query->targetList, into);
> +
> + /* Refresh the materialized view with a fake statement. */
> + refresh = makeNode(RefreshMatViewStmt);
> + refresh->relation = into->rel;
> + refresh->skipData = into->skipData;
> + refresh->concurrent = false;
> +
> + return ExecRefreshMatView(refresh, pstate->p_sourcetext, qc);
> + }
> +
> return InvalidObjectAddress;
> + }
> ```
>
> Maybe an `else` for the other case? Having parallel indentation
> communicates the parallel execution.

Done.

> ```
> @@ -402,14 +529,15 @@ CreateTableAsRelExists(CreateTableAsStmt *ctas)
> oldrelid = get_relname_relid(into->rel->relname, nspid);
> if (OidIsValid(oldrelid))
> {
> - if (!ctas->if_not_exists)
> + if (!ctas->if_not_exists && !into->replace)
> ```
>
> Changing the contract without changing the function comment seems wrong.
>
> But is this really factored correctly? Maybe the check should happen
> in the caller instead.
>
> Does this require a change to ExplainOneUtility, which also calls this
> function? At least we probably need to handle OR REPLACE there.

I added the static CreateTableAsRelReplaceable to leave the existing
CreateTableAsRelExists untouched. The former now only checks whether
the relation exists and can be replaced. The logic in ExecCreateTableAs
is changed to first check is_matview && into->replace.

> ```
> if (newdesc->natts < olddesc->natts)
> - ereport(ERROR,
> - (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
> - errmsg("cannot drop columns from view")));
> + {
> + if (is_matview)
> + ereport(ERROR,
> + errcode(ERRCODE_INVALID_TABLE_DEFINITION),
> + errmsg("cannot drop columns from materialized view"));
> + else
> + ereport(ERROR,
> + (errcode(ERRCODE_INVALID_TABLE_DEFINITION),
> + errmsg("cannot drop columns from view")));
> + }
> ```
>
> Instead of repeating so much, perhaps `errmsg("cannot drop columns
> from %s", is_matview ? "materialized view" : "view")`? Or since that
> is probably bad for translators, at least this:
>
> ```
> errmsg(is_matview
> ? "cannot drop columns from materialized view"
> : "cannot drop columns from view")
> ```
>
> Likewise with more error messages in this function. (There are a lot of them.)

When formatting it like errmsg(is_matview ? "" : "") the PO files only
pick up the first of the two messages. Pulling out the ternary like
this would work:

is_matview ? errmsg(...) : errmsg(...)

But instead I added function checkMatviewColumns to matview.c based on
checkViewColumns to avoid the is_matview flag altogether. Both
functions still implement the same checks. The very last comment in
checkViewColumns mentions leaving column defaults in place. Since
default values can't be set on matview columns I figured that it's best
to provide these checks as separate functions to avoid confusion.

> The parser changes seem fine (see below re the third patch though).
>
> Tab completion seems fine.
>
> From v7-0002-Handle-default-tablespace-in-AlterTableInternal.patch:
>
> These changes should get rolled into the previous patch. I think you
> broke them out to allow considering them separately, although your
> patches would be simpler if they were the first patch in the series
> then.

Yes, I thought separate patches would be easier to review. v8 is now a
single patch.

> ```
> + /* use empty string to specify default tablespace */
> + atcmd->name = into->tableSpaceName ? into->tableSpaceName : "";
> atcmds = lappend(atcmds, atcmd);
> ```
>
> We didn't love using "" as a magic string to signal the
> AT_SetTableSpace subcommand. At least it should be a named constant,
> but maybe a separate struct member is better.

I changed the logic that has been factored out into create_ctas_replace
to now resolve the name of the default tablespace and store that in
atcmd->name.

> ```
> - /* Check that the tablespace exists */
> - tablespaceId = get_tablespace_oid(tablespacename, false);
> + if (tablespacename != NULL && tablespacename[0] == '\0')
> + {
> + /* Use default tablespace if name is empty string */
> + tablespaceId =
> GetDefaultTablespace(rel->rd_rel->relpersistence,
> rel->rd_rel->relispartition);
> + if (!OidIsValid(tablespaceId))
> + tablespaceId = MyDatabaseTableSpace;
> + }
> + else
> + {
> + /* Check that the tablespace exists */
> + tablespaceId = get_tablespace_oid(tablespacename, false);
> + }
>
> /* Check permissions except when moving to database's default */
> if (OidIsValid(tablespaceId) && tablespaceId != MyDatabaseTableSpace)
> ```
>
> This seems like something that should happen during analysis, but I
> know we are less strict for utility statements.
>
> Are there any locking issues with calling GetDefaultTablespace here
> instead of earlier? Why does ATPrepSetTableSpace take a lockmode but
> not use it? Can the table space get dropped in the middle of the
> command? But this is probably handled already. Do we need the lock the
> *old* tablespace?

In case of ALTER TABLE SET TABLESPACE the old and new tablespaces aren't
locked either as far as I can see. Maybe I'm missing something in
tablecmds.c because I would expect something similar to
RangeVarGetAndCheckCreationNamespace taking an AccessShareLock on the
namespace.

ATPrepSetTableSpace takes the lockmode parameter since 2dbbda02e7e which
added lockmode to other ATPrepXXX functions as well without using it. I
guess this was done to provide a consistent API across the various ALTER
TABLE subcommands.

> What about partitions?

Table partitions do not apply to matviews.

> From v7-0003-Add-WITH-OLD-DATA-to-CREATE-OR-REPLACE-MATERIALIZ.patch:
>
> ```
> @@ -383,6 +391,9 @@ ExecCreateTableAs(ParseState *pstate,
> CreateTableAsStmt *stmt,
> */
> if (is_matview)
> {
> + if (into->keepData)
> + elog(ERROR, "must not specify WITH OLD DATA when creating
> a new materialized view");
> +
> do_refresh = !into->skipData;
> into->skipData = true;
> }
> ```
>
> Should this be ereport? I guess the current parser productions should
> prevent us from reaching this code though. So perhaps a comment?

Right. Replaced with ereport and a localizable error message.

> ```
> diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
> index 7aaf0e37ad8..e51ce2701be 100644
> --- a/src/backend/parser/gram.y
> +++ b/src/backend/parser/gram.y
> @@ -4944,6 +4944,22 @@ CreateMatViewStmt:
> $7->replace = true;
> $$ = (Node *) ctas;
> }
> + | CREATE OR REPLACE OptNoLog MATERIALIZED VIEW
> create_mv_target AS SelectStmt WITH OLD DATA_P
> + {
> + CreateTableAsStmt *ctas = makeNode(CreateTableAsStmt);
> +
> + ctas->query = $9;
> + ctas->into = $7;
> + ctas->objtype = OBJECT_MATVIEW;
> + ctas->is_select_into = false;
> + ctas->if_not_exists = false;
> + /* cram additional flags into the IntoClause */
> + $7->rel->relpersistence = $4;
> + $7->skipData = false;
> + $7->keepData = true;
> + $7->replace = true;
> + $$ = (Node *) ctas;
> + }
> ;
> ```
>
> This is the fourth production for minor variations of
> CreateMatViewStmt, which seems like a lot. Perhaps instead of
> opt_with_data we add a new production opt_with_no_or_old_data that can
> return 3 alteratives (WITH [{NO|OLD}] DATA)?

I've added production opt_with_no_or_old_data as suggested, backed by
the new enum WithDataOption in parsenodes.h. With that, I replaced
.keepData from v7 with just .data holding that new enum.

Field .skipData is still in place but I think it should be migrated to
that new .data field. I haven't done that in v8, though, to keep the
patch focused.

> ```
> +-- replace query but keep old data
> +CREATE OR REPLACE MATERIALIZED VIEW mvtest_replace AS
> + SELECT 5 AS a
> + WITH OLD DATA;
> +SELECT * FROM mvtest_replace;
> +REFRESH MATERIALIZED VIEW mvtest_replace;
> +SELECT * FROM mvtest_replace;
> +
> ```
>
> We would like to see a test adding a new column while using WITH OLD
> DATA. It is safe because every heap tuple knows how many attributes it
> holds (and I tested it), but I did wonder if anyone had been tempted
> to call fastgetattr directly for matviews. (Before your feature, it
> would have been safe, I think.) Here is the test I did:
>
> ```
> create materialized view mvnulls as
> select 1 c1, 2 c2, 3 c3, 4 c4, 5 c5, 6 c6, 7 c7, null c8;
> -- add a ninth column (exceeding the old t_bits):
> create or replace materialized view mvnulls as
> select 1 c1, 2 c2, 3 c3, 4 c4, 5 c5, 6 c6, 7 c7, null c8, null c9
> with old data;
> select c9 from mvnulls;
> ```

That test case is now included.

> Another thing to think about and test: what if while using WITH OLD
> DATA the new matview definition violates a unique index? Or a domain?
> Are there any other constraints that can be applied to matviews to
> consider?

Adding a column of a domain type with NOT NULL constraint can indeed
cause WITH OLD DATA to fail since we're setting new columns to NULL in
that case. I've added the following test case:

CREATE DOMAIN mvtest_dom AS int
CONSTRAINT mvtest_dom_nn NOT NULL;
CREATE MATERIALIZED VIEW mvtest_replace AS
SELECT 1::mvtest_dom AS a;
CREATE OR REPLACE MATERIALIZED VIEW mvtest_replace AS
SELECT 2::mvtest_dom AS a, 3::mvtest_dom AS b
WITH OLD DATA;

But besides that I don't see how keeping the old data could violate a
unique index or any other constraint on a domain type. During the
execution of the matview replacement the unique index and domain types
remain unaltered. So the old data will still satisfy any pre-existing
constraints.

And changing the type of an existing column is not possible. So you
can't switch to a different domain type with more restrictive
constraints while keeping the old data. For example, this won't work
since column "a" can't be changed from mvtest_dom1 to mvtest_dom2 on
principle:

CREATE DOMAIN mvtest_dom1 AS int
CONSTRAINT mvtest_dom1_check CHECK (VALUE = 1);
CREATE DOMAIN mvtest_dom2 AS int
CONSTRAINT mvtest_dom2_check CHECK (VALUE = 2);
CREATE MATERIALIZED VIEW mvtest_replace AS
SELECT 1::mvtest_dom1 AS a;
CREATE OR REPLACE MATERIALIZED VIEW mvtest_replace AS
SELECT 2::mvtest_dom2 AS a
WITH OLD DATA;

--
Erik Wienhold

Attachment Content-Type Size
v8-0001-Add-OR-REPLACE-option-to-CREATE-MATERIALIZED-VIEW.patch text/plain 38.5 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Erik Wienhold 2026-06-28 18:32:27 Re: CREATE OR REPLACE MATERIALIZED VIEW
Previous Message Haibo Yan 2026-06-28 17:58:23 Re: [PATCH] DISTINCT in plain aggregate window functions