Re: Implementing Incremental View Maintenance

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: pgsql-hackers(at)postgresql(dot)org
Cc: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, michael(at)paquier(dot)xyz, amitlangote09(at)gmail(dot)com, Takuma Hoshiai <hoshiai(at)sraoss(dot)co(dot)jp>, alvherre(at)2ndquadrant(dot)com, thomas(dot)munro(at)gmail(dot)com, kgrittn(at)gmail(dot)com
Subject: Re: Implementing Incremental View Maintenance
Date: 2019-12-20 05:02:32
Message-ID: 20191220140232.57b43c4f6296f793d42946b2@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Attached is the latest patch (v10) to add support for Incremental
Materialized View Maintenance (IVM).

IVM is a way to make materialized views up-to-date in which only
incremental changes are computed and applied on views rather than
recomputing the contents from scratch as REFRESH MATERIALIZED VIEW
does. IVM can update materialized views more efficiently
than recomputation when only small part of the view need updates.

There are two approaches with regard to timing of view maintenance:
immediate and deferred. In immediate maintenance, views are updated in
the same transaction where its base table is modified. In deferred
maintenance, views are updated after the transaction is committed,
for example, when the view is accessed, as a response to user command
like REFRESH, or periodically in background, and so on.

This patch implements a kind of immediate maintenance, in which
materialized views are updated immediately in AFTER triggers when a
base table is modified.

This supports views using:
- inner and outer joins including self-join
- some built-in aggregate functions (count, sum, agv, min, max)
- a part of subqueries
-- simple subqueries in FROM clause
-- EXISTS subqueries in WHERE clause
- DISTINCT and views with tuple duplicates

===
Here are major changes we made after the previous submitted patch:

* Aggregate functions are checked if they can be used in IVM
using their OID. Per comments from Alvaro Herrera.

For this purpose, Gen_fmgrtab.pl was modified so that OIDs of
aggregate functions are output to fmgroids.h.

* Some bug fixes including:

- Mistake of tab-completion of psql pointed out by nuko-san
- A bug relating rename of matview pointed out by nuko-san
- spelling errors
- etc.

* Add documentations for IVM

* Patch is splited into eleven parts to make review easier
as suggested by Amit Langote:

- 0001: Add a new syntax:
CREATE INCREMENTAL MATERIALIZED VIEW
- 0002: Add a new column relisivm to pg_class
- 0003: Change trigger.c to allow to prolong life span of tupestores
containing Transition Tables generated via AFTER trigger
- 0004: Add the basic IVM future using counting algorithm:
This supports inner joins, DISTINCT, and tuple duplicates.
- 0005: Change GEN_fmgrtab.pl to output aggregate function's OIDs
- 0006: Add aggregates support for IVM
- 0007: Add subqueries support for IVM
- 0008: Add outer joins support for IVM
- 0009: Add IVM support to psql command
- 0010: Add regression tests for IVM
- 0011: Add documentations for IVM

===
Todo:

Currently, REFRESH and pg_dump/pg_restore is not supported, but
we are working on them.

Also, TRUNCATE is not supported. When TRUNCATE command is executed
on a base table, nothing occurs on materialized views. We are
now considering another better options, like:

- Raise an error or warning when a base table is TRUNCATEed.
- Make the view non-scannable (like WITH NO DATA)
- Update the view in some ways. It would be easy for inner joins
or aggregate views, but there is some difficult with outer joins.

Regards,
Yugo Nagata

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

Attachment Content-Type Size
IVM_patches_v10.tar.gz application/gzip 65.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Andrey Borodin 2019-12-20 05:03:57 Disallow cancellation of waiting for synchronous replication
Previous Message Kyotaro Horiguchi 2019-12-20 04:41:25 Re: Fetching timeline during recovery