Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY

From: surya poondla <suryapoondla4(at)gmail(dot)com>
To: cca5507 <cca5507(at)qq(dot)com>
Cc: Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Giuliano Gagliardi <gogi(at)gogi(dot)tv>, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY
Date: 2026-03-25 22:45:11
Message-ID: CAOVWO5o46gqX_=BkrzuTRyPijDBWaYAGFVqn7UXrww_4owhJ0w@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

Hi ChangAo,

Thank you for the continued review, your test case helped me find a
real issue in v3 patch, and I updated the fix accordingly.

You were correct that v3 was wrong. Removing IS NOT NULL and using *= alone
in the pre-check was too aggressive: *= treats NULL=NULL=true, which caused
(NULL,NULL)×2 with a unique index on a nullable column to be flagged
as duplicates even though the unique index allows multiple NULLs.

The updated patch takes a different approach. Instead of removing the guard
and relying on *= alone, the pre-check now uses the same per-column equality
operators as the FULL OUTER JOIN accumulated during the same
index-scanning loop. These operators treat NULL=NULL as false, which is
consistent with how
unique indexes actually work (NULLs are distinct).

For (test, NULL)×2 with a single index on a (non-null):
newdata2.a = newdata.a i.e 'test'='test' that is TRUE
newdata2.* *= newdata.* that is TRUE
Thus the duplicate is caught and error is raised

For (NULL, NULL)×2 with a single index on a (nullable):
newdata2.a = newdata.a i.e NULL=NULL that is NULL (false)
Thus no duplicate is caught, the refresh correctly succeeds, MV gets 2 rows

For (test, NULL)×2 with a composite index on (a, b):
newdata2.a = newdata.a i.e TRUE
newdata2.b = newdata.b i.e NULL=NULL that is NULL (false)
Combined: NULL is not caught, refresh correctly succeeds

Regarding your record_image_eq_variant approach: it correctly handles the
NULL-in-indexed-column case, but it introduces a performance regression for
unchanged rows where any non-indexed column contains NULL.
For example, an unchanged row (1, NULL) with a unique index on non-null
index on a would require a DELETE+INSERT on every CONCURRENTLY refresh
because
record_image_eq_variant((1,NULL),(1,NULL)) returns false.
This makes CONCURRENTLY impractical for any table where rows contain NULLs
in non-indexed columns.

The updated patch for bug 1.

Here is some additional tests I did:
postgres=# SET client_min_messages = WARNING;
SET
postgres=# -- Test 1: (test,NULL)×2, single index on 'a' should ERROR
postgres=# CREATE TABLE t(a text, b text);
INSERT INTO t VALUES('test', NULL);
CREATE MATERIALIZED VCREATE TABLE
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
CREATE UNIQUE INDEX ON m(a);
INSERT INTO t VALUES('test', NULL);
REFSELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must error
ERROR: new data for materialized view "m" contains duplicate rows
DETAIL: Row: (test,)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# -- Test 2: (NULL,NULL)×2, single index on 'a' should SUCCEED
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(NULL, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# INSERT INTO t VALUES(NULL, NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
SELECT COUNT(*) FROM m;
DROP TABLE t CASCADE;REFRESH MATERIALIZED VIEW
postgres=# SELECT COUNT(*) FROM m; -- should be 2
count
-------
2
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# --Test 3: (test,NULL)×2, composite index (a,b) should SUCCEED
postgres=# CREATE TABLE t(a text, b text);
CREATE TABLE
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a, b);
INSERT INTO t VALUES('test', NUL CREATE UNIQUE INDEX ON m(a, b);
CREATE INDEX
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
REFRESH MATERIALIZED VIEW
postgres=# SELECT COUNT(*) FROM m;
DROP TABLE t CASCADE;
SELECT COUNT(*) FROM m; -- must be 2
count
-------
2
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# -- Test 4: unchanged (1,NULL), index on 'a' should SUCCEED.
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
REFRESH MATERIALIZED VIEW
postgres=# SELECT * FROM m; -- must still show (1,)
a | b
---+---
1 |
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE

postgres=# -- Test 5: (1,NULL)×2, separate index on a AND b, should ERROR
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# CREATE UNIQUE INDEX ON m(b);
CREATE INDEX
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must error
ERROR: duplicate key value violates unique constraint "m_a_idx"
DETAIL: Key (a)=(1) already exists.
CONTEXT: SQL statement "INSERT INTO public.m SELECT (diff.newdata).* FROM
pg_temp_2.pg_temp_16535_2 diff WHERE tid IS NULL"
postgres=# DROP TABLE t CASCADE;
DROP TABLE
postgres=#

Made some minor changes to bug2 patch too.

Regards,
Surya Poondla

Attachment Content-Type Size
0005-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-to-detect_bug1.patch application/octet-stream 11.4 KB
0005-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-performance_bug2.patch application/octet-stream 10.0 KB

In response to

Browse pgsql-bugs by date

  From Date Subject
Next Message Andrei Lepikhov 2026-03-26 19:59:13 Re: BUG #19435: Error: "No relation entry for relid 2" Triggered by Complex Join with Self-Referencing Tables
Previous Message Dmitriy Kuzmin 2026-03-25 14:30:38 Re: BUG #19437: temp_tablespaces doesn't work inside a cursor?