Re: Implementing Incremental View Maintenance

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Takuma Hoshiai <hoshiai(at)sraoss(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: Implementing Incremental View Maintenance
Date: 2019-07-11 04:28:04
Message-ID: 20190711132804.c1a9fcde4082ea5a2f0d07fc@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi Thomas,

Thank you for your review and discussion on this patch!

> > 2019年7月8日(月) 15:32 Thomas Munro <thomas(dot)munro(at)gmail(dot)com>:
> >
> > > On Fri, Jun 28, 2019 at 10:56 PM Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > > > Attached is a WIP patch of IVM which supports some aggregate functions.
> > >
> > > Hi Nagata-san and Hoshiai-san,
> > >
> > > Thank you for working on this. I enjoyed your talk at PGCon. I've
> > > added Kevin Grittner just in case he missed this thread; he has talked
> > > often about implementing the counting algorithm, and he wrote the
> > > "trigger transition tables" feature to support exactly this. While
> > > integrating trigger transition tables with the new partition features,
> > > we had to make a number of decisions about how that should work, and
> > > we tried to come up with answers that would work for IMV, and I hope
> > > we made the right choices!

Transition tables is a great feature. I am now using this in my implementation
of IVM, but the first time I used this feature was when I implemented a PoC
for extending view updatability of PostgreSQL[1]. At that time, I didn't know
that this feature is made originally aiming to support IVM.

[1] https://www.pgcon.org/2017/schedule/events/1074.en.html

I think transition tables is a good choice to implement a statement level
immediate view maintenance where materialized views are refreshed in a statement
level after trigger. However, when implementing a transaction level immediate
view maintenance where views are refreshed per transaction, or deferred view
maintenance, we can't update views in a after trigger, and we will need an
infrastructure to manage change logs of base tables. Transition tables can be
used to collect these logs, but using logical decoding of WAL is another candidate.
In any way, if these logs can be collected in a tuplestore, we might able to
convert this to "ephemeral named relation (ENR)" and use this to calculate diff
sets for views.

> > >
> > > I am quite interested to learn how IVM interacts with SERIALIZABLE.
> > >
> >
> > Nagata-san has been studying this. Nagata-san, any comment?

In SERIALIZABLE or REPEATABLE READ level, table changes occurred in other
ransactions are not visible, so views can not be maintained correctly in AFTER
triggers. If a view is defined on two tables and each table is modified in
different concurrent transactions respectively, the result of view maintenance
done in trigger functions can be incorrect due to the race condition. This is the
reason why such transactions are aborted immediately in that case in my current
implementation.

One idea to resolve this is performing view maintenance in two phases. Firstly,
views are updated using only table changes visible in this transaction. Then,
just after this transaction is committed, views have to be updated additionally
using changes happened in other transactions to keep consistency. This is a just
idea, but to implement this idea, I think, we will need keep to keep and
maintain change logs.

> > > A couple of superficial review comments:


> > > + const char *aggname = get_func_name(aggref->aggfnoid);
> > > ...
> > > + else if (!strcmp(aggname, "sum"))
> > >
> > > I guess you need a more robust way to detect the supported aggregates
> > > than their name, or I guess some way for aggregates themselves to
> > > specify that they support this and somehow supply the extra logic.
> > > Perhaps I just waid what Greg Stark already said, except not as well.

Yes. Using name is not robust because users can make same name aggregates like
sum(text) (although I am not sure this makes sense). We can use oids instead
of names, but it would be nice to extend pg_aggregate and add new attributes
for informing that this supports IVM and for providing functions for IVM logic.

> > > As for the question of how
> > > to reserve a namespace for system columns that won't clash with user
> > > columns, according to our manual the SQL standard doesn't allow $ in
> > > identifier names, and according to my copy SQL92 "intermediate SQL"
> > > doesn't allow identifiers that end in an underscore. I don't know
> > > what the best answer is but we should probably decide on a something
> > > based the standard.

Ok, so we should use "__ivm_count__" since this ends in "_" at least.

Another idea is that users specify the name of this special column when
defining materialized views with IVM support. This way can avoid the conflict
because users will specify a name which does not appear in the target list.

As for aggregates supports, it may be also possible to make it a restriction
that count(expr) must be in the target list explicitly when sum(expr) or
avg(expr) is included, instead of making hidden column like __ivm_count_sum__,
like Oracle does.

> > >
> > > As for how to make internal columns invisible to SELECT *, previously
> > > there have been discussions about doing that using a new flag in
> > > pg_attribute:
> > >
> > >
> > > https://www.postgresql.org/message-id/flat/CAEepm%3D3ZHh%3Dp0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA%40mail.gmail.com

I agree implementing this feature in PostgreSQL since there are at least a few
use cases, IVM and temporal database.

> > >
> > > + "WITH t AS ("
> > > + " SELECT diff.__ivm_count__,
> > > (diff.__ivm_count__ = mv.__ivm_count__) AS for_dlt, mv.ctid"
> > > + ", %s"
> > > + " FROM %s AS mv, %s AS diff WHERE (%s) =
> > > (%s)"
> > > + "), updt AS ("
> > > + " UPDATE %s AS mv SET __ivm_count__ =
> > > mv.__ivm_count__ - t.__ivm_count__"
> > > + ", %s "
> > > + " FROM t WHERE mv.ctid = t.ctid AND NOT
> > > for_dlt"
> > > + ") DELETE FROM %s AS mv USING t WHERE
> > > mv.ctid = t.ctid AND for_dlt;",
> > >
> > > I fully understand that this is POC code, but I am curious about one
> > > thing. These queries that are executed by apply_delta() would need to
> > > be converted to C, or at least used reusable plans, right? Hmm,
> > > creating and dropping temporary tables every time is a clue that the
> > > ultimate form of this should be tuplestores and C code, I think,
> > > right?

I used SPI just because REFRESH CONCURRENTLY uses this, but, as you said,
it is inefficient to create/drop temp tables and perform parse/plan every times.
It seems to be enough to perform this once when creating materialized views or
at the first maintenance time.

Best regards,
Yugo Nagata

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2019-07-11 04:31:35 Re: POC: Cleaning up orphaned files using undo logs
Previous Message Michael Paquier 2019-07-11 04:04:33 Re: Add parallelism and glibc dependent only options to reindexdb