Re: Implementing Incremental View Maintenance

From: Zhihong Yu <zyu(at)yugabyte(dot)com>
To: Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrew Dunstan <andrew(at)dunslane(dot)net>, Andy Fan <zhihui(dot)fan1213(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Subject: Re: Implementing Incremental View Maintenance
Date: 2021-08-07 07:52:24
Message-ID: CALNJ-vRSVtqZDXU6-DjdycFkjonxKs2TbBFb=W=W9Y7KG0rRwA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, Aug 7, 2021 at 12:00 AM Zhihong Yu <zyu(at)yugabyte(dot)com> wrote:

>
>
> On Sun, Aug 1, 2021 at 11:30 PM Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
>
>> Hi hackers,
>>
>> On Mon, 19 Jul 2021 09:24:30 +0900
>> Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp> wrote:
>>
>> > On Wed, 14 Jul 2021 21:22:37 +0530
>> > vignesh C <vignesh21(at)gmail(dot)com> wrote:
>>
>> > > The patch does not apply on Head anymore, could you rebase and post a
>> > > patch. I'm changing the status to "Waiting for Author".
>> >
>> > Ok. I'll update the patch in a few days.
>>
>> Attached is the latest patch set to add support for Incremental
>> Materialized View Maintenance (IVM)
>>
>> The patches are rebased to the master and also revised with some
>> code cleaning.
>>
>> IVM is a way to make materialized views up-to-date in which only
>> incremental changes are computed and applied on views rather than
>> recomputing the contents from scratch as REFRESH MATERIALIZED VIEW
>> does. IVM can update materialized views more efficiently
>> than recomputation when only small part of the view need updates.
>>
>> The patch set implements a feature so that materialized views could be
>> updated automatically and immediately when a base table is modified.
>>
>> Currently, our IVM implementation supports views which could contain
>> tuple duplicates whose definition includes:
>>
>> - inner and outer joins including self-join
>> - DISTINCT
>> - some built-in aggregate functions (count, sum, agv, min, and max)
>> - a part of subqueries
>> -- simple subqueries in FROM clause
>> -- EXISTS subqueries in WHERE clause
>> - CTEs
>>
>> We hope the IVM feature would be adopted into pg15. However, the size of
>> patch set has grown too large through supporting above features.
>> Therefore,
>> I think it is better to consider only a part of these features for the
>> first
>> release. Especially, I would like propose the following features for pg15.
>>
>> - inner joins including self-join
>> - DISTINCT and views with tuple duplicates
>> - some built-in aggregate functions (count, sum, agv, min, and max)
>>
>> By omitting outer-join, sub-queries, and CTE features, the patch size
>> becomes
>> less than half. I hope this will make a bit easer to review the IVM patch
>> set.
>>
>> Here is a list of separated patches.
>>
>> - 0001: Add a new syntax:
>> CREATE INCREMENTAL MATERIALIZED VIEW
>> - 0002: Add a new column relisivm to pg_class
>> - 0003: Add new deptype option 'm' in pg_depend
>> - 0004: Change trigger.c to allow to prolong life span of tupestores
>> containing Transition Tables generated via AFTER trigger
>> - 0005: Add IVM supprot for pg_dump
>> - 0006: Add IVM support for psql
>> - 0007: Add the basic IVM future:
>> This supports inner joins, DISTINCT, and tuple duplicates.
>> - 0008: Add aggregates (count, sum, avg, min, max) support for IVM
>> - 0009: Add regression tests for IVM
>> - 0010: Add documentation for IVM
>>
>> We could split the patch furthermore if this would make reviews much
>> easer.
>> For example, I think 0007 could be split into the more basic part and the
>> part
>> for handling tuple duplicates. Moreover, 0008 could be split into
>> "min/max"
>> and other aggregates because handling min/max is a bit more complicated
>> than
>> others.
>>
>> I also attached IVM_extra.tar.gz that contains patches for sub-quereis,
>> outer-join, CTE support, just for your information.
>>
>> Regards,
>> Yugo Nagata
>>
>> --
>> Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
>>
>>
>> --
>> Yugo NAGATA <nagata(at)sraoss(dot)co(dot)jp>
>>
>
> Hi,
> For v23-0007-Add-Incremental-View-Maintenance-support.patch :
>
> bq. In this implementation, AFTER triggers are used to collecting
> tuplestores
>
> 'to collecting' -> to collect
>
> bq. are contained in a old transition table.
>
> 'a old' -> an old
>
> bq. updates of more than one base tables
>
> one base tables -> one base table
>
> bq. DISTINCT and tuple duplicates in views are supported
>
> Since distinct and duplicate have opposite meanings, it would be better to
> rephrase the above sentence.
>
> bq. The value in__ivm_count__ is updated
>
> I searched the patch for in__ivm_count__ - there was no (second) match. I
> think there should be a space between in and underscore.
>
> +static void CreateIvmTriggersOnBaseTables_recurse(Query *qry, Node *node,
> Oid matviewOid, Relids *relids, bool ex_lock);
>
> nit: long line. please wrap.
>
> + if (rewritten->distinctClause)
> + rewritten->groupClause = transformDistinctClause(NULL,
> &rewritten->targetList, rewritten->sortClause, false);
> +
> + /* Add count(*) for counting distinct tuples in views */
> + if (rewritten->distinctClause)
>
> It seems the body of the two if statements can be combined into one.
>
> More to follow for this patch.
>
> Cheers
>
Hi,

+ CreateIvmTriggersOnBaseTables_recurse(qry, (Node *)qry, matviewOid,
&relids, ex_lock);

Looking at existing recursive functions, e.g.

src/backend/executor/execPartition.c:find_matching_subplans_recurse(PartitionPruningData
*prunedata,

the letters in the function name are all lower case. I think following the
convention would be nice.

+ if (rte->rtekind == RTE_RELATION)
+ {
+ if (!bms_is_member(rte->relid, *relids))

The conditions for the two if statements can be combined (saving some
indentation).

+ check_stack_depth();
+
+ if (node == NULL)
+ return false;

It seems the node check can be placed ahead of the stack depth check.

+ * CreateindexOnIMMV

CreateindexOnIMMV -> CreateIndexOnIMMV

+ (errmsg("could not create an index on materialized view
\"%s\" automatically",

It would be nice to mention the reason is the lack of primary key.

+ /* create no index, just notice that an appropriate index is
necessary for efficient, IVM */

for efficient -> for efficiency.

Cheers

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Meskes 2021-08-07 08:13:30 Re: ECPG bug fix: DECALRE STATEMENT and DEALLOCATE, DESCRIBE
Previous Message Drouvot, Bertrand 2021-08-07 07:20:09 Re: [UNVERIFIED SENDER] Re: [bug] Logical Decoding of relation rewrite with toast does not reset toast_hash