Re: Implementing Incremental View Maintenance

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Jim Finnerty <jfinnert(at)amazon(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Implementing Incremental View Maintenance
Date: 2019-06-28 10:01:43
Message-ID: 20190628190143.0e006e12d77458ecff12cf54@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Jim,

On Fri, 21 Jun 2019 08:41:11 -0700 (MST)
Jim Finnerty <jfinnert(at)amazon(dot)com> wrote:

> Hi Yugo,
>
> I'd like to compare the performance of your MV refresh algorithm versus
> an approach that logs changes into an mv log table, and can then apply the
> changes at some later point in time. I'd like to handle the materialized
> join view (mjv) case first, specifically a 2-way left outer join, with a UDF
> in the SELECT list of the mjv.

Do you mean you have your implementation of IVM that using log tables?
I'm so interested in this, and I would appreciate it if you explain the
detail.

> Does your refresh algorithm handle mjv's with connected join graphs that
> consist entirely of inner and left outer joins?

> If so, I'd like to measure the overhead of your refresh algorithm on
> pgbench, modified to include an mjv, versus a (hand coded) incremental
> maintenance algorithm that uses mv log tables populated by ordinary
> triggers. We may also want to look at capturing the deltas using logical
> replication, which ought to be faster than a trigger-based solution.

In the current our implementation, outer joins is not yet supported though
we plan to handle this in future. So,we would not be able to compare these
directly in the same workload in the current status.

However, the current our implementation supports only the way to update
materialized views in a trigger, and the performance of modifying base tables
will be lower than the approach which uses log tables. This is because queries
to update materialized views are issued in the trigger. This is not only a
overhead itself, but also takes a lock on a materialized view, which has an ]
impact on concurrent execution performance.

In the previous our PoC, we implemented IVM using log tables, in which logs are
captured by triggers and materialized views are update incrementally by a user
command[1]. However, to implement log table approach, we need a infrastructure
to maintain these logs. For example, which logs are necessary and which logs
can be discarded, etc. We thought this is not trivial work, so we decided to
start from the current approach which doesn't use log tables. We are now
preparing to implement this in the next step because this is also needed to
support deferred maintenance of views.

[1] https://www.postgresql.eu/events/pgconfeu2018/schedule/session/2195-implementing-incremental-view-maintenance-on-postgresql/

I agree that capturing the deltas using logical decoding will be faster than
using a trigger although we haven't yet consider this well.

Best regadrds,
Yugo Nagata

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Etsuro Fujita 2019-06-28 10:15:56 Re: postgres_fdw: Minor improvement to postgresAcquireSampleRowsFunc
Previous Message Dmitry Dolgov 2019-06-28 09:57:21 Re: C testing for Postgres