| From: | Dharin Shah <dharinshah95(at)gmail(dot)com> |
|---|---|
| To: | Adam Brusselback <adambrusselback(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
| Subject: | Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW |
| Date: | 2026-01-15 20:50:54 |
| Message-ID: | CAOj6k6e7fw8RjAWXc04_A=sg4=jsU0CK7Qi13fwkPW5hMz6a5w@mail.gmail.com |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
>
> (repro_include_issue.sql)
Typo fix : test_include_bug.sql (attached file)
Thanks,
Dharin
On Thu, Jan 15, 2026 at 7:46 PM Dharin Shah <dharinshah95(at)gmail(dot)com> wrote:
> Hey Adam,
>
> Apologies for the delay, and as promised on discord, I did a review of the
> current patch (cf 6305) and wanted to share findings that line up with the
> thread’s design discussion, plus one additional correctness bug that I
> could reproduce.
>
> 1. In the non-concurrent REFRESH ... WHERE .... path, the UPSERT SQL is
> built using the unique index metadata. The code currently uses indnatts
> when building the ON Conflict (...) target list. That includes INCLUDE
> columns, so for an index like:
>
> CREATE UNIQUE INDEX ON mv(id) INCLUDE (extra);
> the generated statement becomes effectively ON CONFLICT (id, extra) ...,
> which fails with:
> ERROR: there is no unique or exclusion constraint matching the ON CONFLICT
> specification
>
> The fix appears straightforward: use indnkeyatts (key attributes only)
> when generating the conflict target, and also when deciding which columns
> are “key” for the UPDATE SET clause. I’ve attached a minimal repro SQL
> script (repro_include_issue.sql)
>
> 2. Another small test quality issue: the regression script has a comment
> “Subqueries -> Error” but the expected output shows no error for the
> schema-qualified subquery. There is no explicit check forbidding subqueries
> in transformRefreshWhereClause(), so schema-qualified subqueries appear
> allowed.
>
> Moving on to broader questions
>
>
>
>> I believe the issue is that the DELETE -> INSERT strategy leaves a
>> consistency gap. Since we relied on ROW EXCLUSIVE locks to allow concurrent
>> reads, the moment we delete the rows, we lose the physical lock on them. If
>> a concurrent transaction inserts a colliding row during that gap, the
>> materialized view ends up inconsistent with the base query (or hits a
>> constraint violation).
>
>
> Consistency gap in the non-concurrent mode matches what I’d expect: with
> ROW EXCLUSIVE you allow concurrent readers/writers, and a pure DELETE →
> INSERT approach can create a window where the old tuple is gone and a
> concurrent session can insert a conflicting logical row.
>
> That said, I think it would help the patch to explicitly define the
> intended safety model:
> 1. Is the goal to be safe against concurrent DML on base tables only
> (i.e., refresh sees a snapshot and updates MV accordingly), or also to be
> safe against concurrent partial refreshes and direct writes to the MV (when
> maintenance is enabled)?
> 2. Should the non-concurrent partial refresh be “best effort” like normal
> DML (user coordinates), or should it be “maintenance-like” (serialized /
> logically safe by default)?
>
> If the intent is “safe by default”, I’d encourage documenting very clearly
> what’s guaranteed, and adding regression/README-style notes for footguns
>
> From a reviewer standpoint, I think the feature concept is sound and
> valuable, but it needs a crisp statement of semantics and safety
> boundaries. The tricky part is exactly what you called out: incremental
> refresh implies concurrency questions that aren’t present with full rebuild
> + strong locks.
>
> I’m happy to keep reviewing iterations (especially around the advisory
> lock approach), and I’ll attach the reproduction scripts and notes I used.
>
> As a possible staging approach: it might be simplest to start with a
> conservative serialization model for non-concurrent WHERE (while still
> allowing readers), and then iterate toward finer-grained logical locking
> if/when needed for throughput.
>
>
> Thanks,
> Dharin
>
>
> On Sun, Jan 4, 2026 at 3:56 AM Adam Brusselback <adambrusselback(at)gmail(dot)com>
> wrote:
>
>> Hi all,
>>
>> I've been running some more concurrency tests against this patch
>> (specifically looking for race conditions), and I found a flaw in the
>> implementation for the REFRESH ... WHERE ... mode (without CONCURRENTLY).
>>
>> I believe the issue is that the DELETE -> INSERT strategy leaves a
>> consistency gap. Since we relied on ROW EXCLUSIVE locks to allow concurrent
>> reads, the moment we delete the rows, we lose the physical lock on them. If
>> a concurrent transaction inserts a colliding row during that gap, the
>> materialized view ends up inconsistent with the base query (or hits a
>> constraint violation).
>>
>> I initially was using SELECT ... FOR UPDATE to lock the rows before
>> modification, but that lock is (now that I know) obviously lost when the
>> row is deleted.
>>
>> My plan is to replace that row-locking strategy with transaction-level
>> advisory locks inside the refresh logic:
>>
>> Before the DELETE, run a SELECT pg_advisory_xact_lock(mv_oid,
>> hashtext(ROW(unique_keys)::text)) for the rows matching the WHERE clause.
>>
>> This effectively locks the "logical" ID of the row, preventing concurrent
>> refreshes on the same ID even while the physical tuple is temporarily gone.
>> Hash collisions should not have any correctness issues that I can think of.
>>
>> However, before I sink time into implementing that fix:
>>
>> Is there general interest in having REFRESH MATERIALIZED VIEW ... WHERE
>> ... in core?
>> If the community feels this feature is a footgun or conceptually wrong
>> for Postgres, I'd rather know now before spending more time on this.
>>
>> If the feature concept is sound, does the advisory lock approach seem
>> like the right way to handle the concurrency safety here?
>>
>> Thanks,
>> Adam Brusselback
>>
>
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Sami Imseih | 2026-01-15 20:53:20 | Re: [[BUG] pg_stat_statements crashes with var and non-var expressions in IN clause |
| Previous Message | Corey Huinker | 2026-01-15 20:21:23 | Re: Extended Statistics set/restore/clear functions. |