Re: Implementing Incremental View Maintenance

From: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>
To: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Cc: Tatsuo Ishii <ishii(at)sraoss(dot)co(dot)jp>, pgsql-hackers(at)postgresql(dot)org, hoshiai(at)sraoss(dot)co(dot)jp, thomas(dot)munro(at)gmail(dot)com, kgrittn(at)gmail(dot)com
Subject: Re: Implementing Incremental View Maintenance
Date: 2019-11-29 08:33:28
Message-ID: 20191129173328.e5a0e9f81e369a3769c4fd0c@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, 28 Nov 2019 11:03:33 -0300
Alvaro Herrera <alvherre(at)2ndquadrant(dot)com> wrote:

> One thing pending in this development line is how to catalogue aggregate
> functions that can be used in incrementally-maintainable views.
> I saw a brief mention somewhere that the devels knew it needed to be
> done, but I don't see in the thread that they got around to doing it.
> Did you guys have any thoughts on how it can be represented in catalogs?
> It seems sine-qua-non ...

Yes, this is a pending issue. Currently, supported aggregate functions are
identified their name, that is, we support aggregate functions named "count",
"sum", "avg", "min", or "max". As mentioned before, this is not robust
because there might be user-defined aggregates with these names although all
built-in aggregates can be used in IVM.

In our implementation, the new aggregate values are calculated using "+" and
"-" operations for sum and count, "/" for agv, and ">=" / "<=" for min/max.
Therefore, if there is a user-defined aggregate on a user-defined type which
doesn't support these operators, errors will raise. Obviously, this is a
problem. Even if these operators are defined, the semantics of user-defined
aggregate functions might not match with the way of maintaining views, and
resultant might be incorrect.

I think there are at least three options to prevent these problems.

In the first option, we support only built-in aggregates which we know able
to handle correctly. Supported aggregates can be identified using their OIDs.
User-defined aggregates are not supported. I think this is the simplest and
easiest way.

Second, supported aggregates can be identified using name, like the current
implementation, but also it is checked if required operators are defined. In
this case, user-defined aggregates are allowed to some extent and we can
prevent errors during IVM although aggregates value in view might be
incorrect if the semantics doesn't match.

Third, we can add a new attribute to pg_aggregate which shows if each
aggregate can be used in IVM. We don't need to use names or OIDs list of
supported aggregates although we need modification of the system catalogue.

Regarding pg_aggregate, now we have aggcombinefn attribute for supporting
partial aggregation. Maybe we could use combine functions to calculate new
aggregate values in IVM when tuples are inserted into a table. However, in
the context of IVM, we also need other function used when tuples are deleted
from a table, so we can not use partial aggregation for IVM in the current
implementation. This might be another option to implement "inverse combine
function"(?) for IVM, but I am not sure it worth.

Regards,
Yugo Nagata

>
> --
> Álvaro Herrera https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Julien Rouhaud 2019-11-29 08:39:09 Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?
Previous Message Kyotaro Horiguchi 2019-11-29 08:14:21 Re: Remove page-read callback from XLogReaderState.