| From: | Nico Williams <nico(at)cryptonector(dot)com> |
|---|---|
| To: | Kirk Wolak <wolakk(at)gmail(dot)com> |
| Cc: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Adam Brusselback <adambrusselback(at)gmail(dot)com>, Nikolay Samokhvalov <nik(at)postgres(dot)ai>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru> |
| Subject: | Re: [Patch] Add WHERE clause support to REFRESH MATERIALIZED VIEW |
| Date: | 2025-12-09 05:35:42 |
| Message-ID: | aTe1LiumA5sZIwxc@ubby |
| Views: | Whole Thread | Raw Message | Download mbox | Resend email |
| Thread: | |
| Lists: | pgsql-hackers |
On Tue, Dec 09, 2025 at 12:08:58AM -0500, Kirk Wolak wrote:
> Our understanding is that many people run into this exact issue. The
> cache needs small frequent updates.
> (After reading the code that handles MVs, we just created our own TABLE,
> and maintain it with a scheduler to rebuild HOURLY,
> and when we process the file, a Simple UPDATE is issued for the one column).
>
> While this "Works", the CONCEPT of this patch (untested by me, as of
> yet), would have fixed this with far less effort,
> and would be easier to maintain.
>
> After I review the code, I will add additional comments.
>
> I am curious what others think? (And FWIW, I believe that the larger the
> MV, the MORE this feature is needed,
> vs refreshing the ENTIRE view).
What I do is I have my own materialized view infrastructure, written
entirely in PlPgSQL, and I completely avoid PG's MV support. This
alternative MV scheme creates an actual table for each MV, which means:
- one can update the MV directly (and I do, via triggers)
- one can have triggers on the MV (e.g., to record history)
This is has been very handy for me.
I also have a state table in which to keep track of whether an MV needs
a refresh, and I have a function i can use to mark an MV as needing a
refresh. Marking an MV as needing a refresh sends a NOTIFY, and then I
have a daemon that will refresh views as needed (with some
debouncing/coalescing of notifications).
This way I can have MVs with very complex underlying queries for which
some kinds of updates I can easily write fast triggers for and others
where I can't (or where they would slow down transactions too much) I
simply mark the MV as needing a refresh.
Typical MV queries I have that this works very well for include
transitive reachability closure computations (e.g., all the groups a
thing is a member of, directly and indirectly, or vice versa --
recursive CTEs basically). Though I do now have triggers that can do a
reasonably good job of synchronously and quickly updating MVs with such
queries, it's just I didn't always.
Refreshes are always 'concurrent'.
This is my 80/20 solution to the "Incremental View Maintenance" (IVM)
problem.
A not very current version is here:
https://github.com/twosigma/postgresql-contrib/blob/master/mat_views.sql
If you like it I might be able to get a newer version out. The version
above has a few minor issues:
- it uses DELETE FROM instead of TRUNCATE for its' sort-of temp tables
- using TRUNCATE ends up requiring some care to avoid occasional
deadlocks with VACUUM that are due to using tables as types of the
columns of the deltas tables
- logging -- lots of logging in the newest version
Another issue is that I rely on NATURAL FULL OUTER JOIN to avoid having
to generate ON conditions, but that means that all columns of the
underlying VIEW must not have NULLs. As I've not needed to support
nullable columns in these MVs, I don't mind.
Nico
--
| From | Date | Subject | |
|---|---|---|---|
| Next Message | Chao Li | 2025-12-09 05:35:47 | Re: [Proposal] Adding callback support for custom statistics kinds |
| Previous Message | Amit Kapila | 2025-12-09 05:34:02 | Re: Fix LOCK_TIMEOUT handling in slotsync worker |