| From: | surya poondla <suryapoondla4(at)gmail(dot)com> |
|---|---|
| To: | cca5507 <cca5507(at)qq(dot)com> |
| Cc: | Rafia Sabih <rafia(dot)pghackers(at)gmail(dot)com>, Giuliano Gagliardi <gogi(at)gogi(dot)tv>, pgsql-bugs <pgsql-bugs(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: Two issues with REFRESH MATERIALIZED VIEW CONCURRENTLY |
| Date: | 2026-03-25 22:45:11 |
| Message-ID: | CAOVWO5o46gqX_=BkrzuTRyPijDBWaYAGFVqn7UXrww_4owhJ0w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-bugs |
Hi ChangAo,
Thank you for the continued review, your test case helped me find a
real issue in v3 patch, and I updated the fix accordingly.
You were correct that v3 was wrong. Removing IS NOT NULL and using *= alone
in the pre-check was too aggressive: *= treats NULL=NULL=true, which caused
(NULL,NULL)×2 with a unique index on a nullable column to be flagged
as duplicates even though the unique index allows multiple NULLs.
The updated patch takes a different approach. Instead of removing the guard
and relying on *= alone, the pre-check now uses the same per-column equality
operators as the FULL OUTER JOIN accumulated during the same
index-scanning loop. These operators treat NULL=NULL as false, which is
consistent with how
unique indexes actually work (NULLs are distinct).
For (test, NULL)×2 with a single index on a (non-null):
newdata2.a = newdata.a i.e 'test'='test' that is TRUE
newdata2.* *= newdata.* that is TRUE
Thus the duplicate is caught and error is raised
For (NULL, NULL)×2 with a single index on a (nullable):
newdata2.a = newdata.a i.e NULL=NULL that is NULL (false)
Thus no duplicate is caught, the refresh correctly succeeds, MV gets 2 rows
For (test, NULL)×2 with a composite index on (a, b):
newdata2.a = newdata.a i.e TRUE
newdata2.b = newdata.b i.e NULL=NULL that is NULL (false)
Combined: NULL is not caught, refresh correctly succeeds
Regarding your record_image_eq_variant approach: it correctly handles the
NULL-in-indexed-column case, but it introduces a performance regression for
unchanged rows where any non-indexed column contains NULL.
For example, an unchanged row (1, NULL) with a unique index on non-null
index on a would require a DELETE+INSERT on every CONCURRENTLY refresh
because
record_image_eq_variant((1,NULL),(1,NULL)) returns false.
This makes CONCURRENTLY impractical for any table where rows contain NULLs
in non-indexed columns.
The updated patch for bug 1.
Here is some additional tests I did:
postgres=# SET client_min_messages = WARNING;
SET
postgres=# -- Test 1: (test,NULL)×2, single index on 'a' should ERROR
postgres=# CREATE TABLE t(a text, b text);
INSERT INTO t VALUES('test', NULL);
CREATE MATERIALIZED VCREATE TABLE
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
CREATE UNIQUE INDEX ON m(a);
INSERT INTO t VALUES('test', NULL);
REFSELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must error
ERROR: new data for materialized view "m" contains duplicate rows
DETAIL: Row: (test,)
postgres=# DROP TABLE t CASCADE;
DROP TABLE
postgres=# -- Test 2: (NULL,NULL)×2, single index on 'a' should SUCCEED
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(NULL, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# INSERT INTO t VALUES(NULL, NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
SELECT COUNT(*) FROM m;
DROP TABLE t CASCADE;REFRESH MATERIALIZED VIEW
postgres=# SELECT COUNT(*) FROM m; -- should be 2
count
-------
2
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE
postgres=# --Test 3: (test,NULL)×2, composite index (a,b) should SUCCEED
postgres=# CREATE TABLE t(a text, b text);
CREATE TABLE
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a, b);
INSERT INTO t VALUES('test', NUL CREATE UNIQUE INDEX ON m(a, b);
CREATE INDEX
postgres=# INSERT INTO t VALUES('test', NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
REFRESH MATERIALIZED VIEW
postgres=# SELECT COUNT(*) FROM m;
DROP TABLE t CASCADE;
SELECT COUNT(*) FROM m; -- must be 2
count
-------
2
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE
postgres=# -- Test 4: unchanged (1,NULL), index on 'a' should SUCCEED.
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must succeed
REFRESH MATERIALIZED VIEW
postgres=# SELECT * FROM m; -- must still show (1,)
a | b
---+---
1 |
(1 row)
postgres=# DROP TABLE t CASCADE;
DROP TABLE
postgres=# -- Test 5: (1,NULL)×2, separate index on a AND b, should ERROR
postgres=# CREATE TABLE t(a int, b int);
CREATE TABLE
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# CREATE MATERIALIZED VIEW m AS SELECT * FROM t;
SELECT 1
postgres=# CREATE UNIQUE INDEX ON m(a);
CREATE INDEX
postgres=# CREATE UNIQUE INDEX ON m(b);
CREATE INDEX
postgres=# INSERT INTO t VALUES(1, NULL);
INSERT 0 1
postgres=# REFRESH MATERIALIZED VIEW CONCURRENTLY m; -- must error
ERROR: duplicate key value violates unique constraint "m_a_idx"
DETAIL: Key (a)=(1) already exists.
CONTEXT: SQL statement "INSERT INTO public.m SELECT (diff.newdata).* FROM
pg_temp_2.pg_temp_16535_2 diff WHERE tid IS NULL"
postgres=# DROP TABLE t CASCADE;
DROP TABLE
postgres=#
Made some minor changes to bug2 patch too.
Regards,
Surya Poondla
| Attachment | Content-Type | Size |
|---|---|---|
| 0005-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-to-detect_bug1.patch | application/octet-stream | 11.4 KB |
| 0005-Fix-REFRESH-MATERIALIZED-VIEW-CONCURRENTLY-performance_bug2.patch | application/octet-stream | 10.0 KB |
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Andrei Lepikhov | 2026-03-26 19:59:13 | Re: BUG #19435: Error: "No relation entry for relid 2" Triggered by Complex Join with Self-Referencing Tables |
| Previous Message | Dmitriy Kuzmin | 2026-03-25 14:30:38 | Re: BUG #19437: temp_tablespaces doesn't work inside a cursor? |