Re: Implementing Incremental View Maintenance

From: Takuma Hoshiai <hoshiai(at)sraoss(dot)co(dot)jp>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
Cc: Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Kevin Grittner <kgrittn(at)gmail(dot)com>
Subject: Re: Implementing Incremental View Maintenance
Date: 2019-07-10 02:29:38
Message-ID: 20190710112938.802310778628d1c6f45d8689@sraoss.co.jp
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 10 Jul 2019 11:07:15 +0900
Takuma Hoshiai <takuma(dot)hoshiai(at)gmail(dot)com> wrote:

> Hi Thomas,
>
> 2019年7月8日(月) 15:32 Thomas Munro <thomas(dot)munro(at)gmail(dot)com>:
>
> > On Fri, Jun 28, 2019 at 10:56 PM Yugo Nagata <nagata(at)sraoss(dot)co(dot)jp> wrote:
> > > Attached is a WIP patch of IVM which supports some aggregate functions.
> >
> > Hi Nagata-san and Hoshiai-san,
> >
> > Thank you for working on this. I enjoyed your talk at PGCon. I've
> > added Kevin Grittner just in case he missed this thread; he has talked
> > often about implementing the counting algorithm, and he wrote the
> > "trigger transition tables" feature to support exactly this. While
> > integrating trigger transition tables with the new partition features,
> > we had to make a number of decisions about how that should work, and
> > we tried to come up with answers that would work for IMV, and I hope
> > we made the right choices!
> >
> > I am quite interested to learn how IVM interacts with SERIALIZABLE.
> >
>
> Nagata-san has been studying this. Nagata-san, any comment?
>
> > A couple of superficial review comments:
>
> Thank you for your review comments.
> Please find attached patches. The some of your review is reflected in patch
> too.

Sorry, I forgot attaching patch.
In addition, avg() function is supported newly. We found a issue
when use avg() with IVM, added its reproduction case in
regressio test. We are being to fix now.

> We manage and update IVM on following github repository.
> https://github.com/sraoss/pgsql-ivm
> you also can found latest WIP patch here.
>
>
> > + const char *aggname = get_func_name(aggref->aggfnoid);
> > ...
> > + else if (!strcmp(aggname, "sum"))
> >
> > I guess you need a more robust way to detect the supported aggregates
> > than their name, or I guess some way for aggregates themselves to
> > specify that they support this and somehow supply the extra logic.
> > Perhaps I just waid what Greg Stark already said, except not as well.
> >
>
> We have recognized the issue and are welcome any input.
>
> > + elog(ERROR, "Aggrege function %s is not
> > supported", aggname);
> >
> > s/Aggrege/aggregate/
> >
>
> I fixed this typo.
>
> > Of course it is not helpful to comment on typos at this early stage,
> > it's just that this one appears many times in the test output :-)
> >
> > +static bool
> > +isIvmColumn(const char *s)
> > +{
> > + char pre[7];
> > +
> > + strlcpy(pre, s, sizeof(pre));
> > + return (strcmp(pre, "__ivm_") == 0);
> > +}
> >
> > What about strncmp(s, "__ivm_", 6) == 0)?
>
>
> I agree with you, I fixed it.
>
> > As for the question of how
> > to reserve a namespace for system columns that won't clash with user
> > columns, according to our manual the SQL standard doesn't allow $ in
> > identifier names, and according to my copy SQL92 "intermediate SQL"
> > doesn't allow identifiers that end in an underscore. I don't know
> > what the best answer is but we should probably decide on a something
> > based the standard.
> >
> > As for how to make internal columns invisible to SELECT *, previously
> > there have been discussions about doing that using a new flag in
> > pg_attribute:
> >
> >
> > https://www.postgresql.org/message-id/flat/CAEepm%3D3ZHh%3Dp0nEEnVbs1Dig_UShPzHUcMNAqvDQUgYgcDo-pA%40mail.gmail.com
> >
> > + "WITH t AS ("
> > + " SELECT diff.__ivm_count__,
> > (diff.__ivm_count__ = mv.__ivm_count__) AS for_dlt, mv.ctid"
> > + ", %s"
> > + " FROM %s AS mv, %s AS diff WHERE (%s) =
> > (%s)"
> > + "), updt AS ("
> > + " UPDATE %s AS mv SET __ivm_count__ =
> > mv.__ivm_count__ - t.__ivm_count__"
> > + ", %s "
> > + " FROM t WHERE mv.ctid = t.ctid AND NOT
> > for_dlt"
> > + ") DELETE FROM %s AS mv USING t WHERE
> > mv.ctid = t.ctid AND for_dlt;",
> >
> > I fully understand that this is POC code, but I am curious about one
> > thing. These queries that are executed by apply_delta() would need to
> > be converted to C, or at least used reusable plans, right? Hmm,
> > creating and dropping temporary tables every time is a clue that the
> > ultimate form of this should be tuplestores and C code, I think,
> > right?
> >
>
> Nagata-san is investing the issue.
>
>
> > > Moreover, some regression test are added for aggregate functions support.
> > > This is Hoshiai-san's work.
> >
> > Great. Next time you post a WIP patch, could you please fix this
> > small compiler warning?
> >
> > describe.c: In function ‘describeOneTableDetails’:
> > describe.c:3270:55: error: ‘*((void *)&tableinfo+48)’ may be used
> > uninitialized in this function [-Werror=maybe-uninitialized]
> > if (verbose && tableinfo.relkind == RELKIND_MATVIEW && tableinfo.isivm)
> > ^
> > describe.c:1495:4: note: ‘*((void *)&tableinfo+48)’ was declared here
> > } tableinfo;
> > ^
> >
>
> It is fixed too.
>
> > Then our unofficial automatic CI system[1] will run these tests every
> > day, which sometimes finds problems.
> >
> > [1] cfbot.cputube.org
> >
> > --
> > Thomas Munro
> > https://enterprisedb.com
> >
> >
> Best regards,
>
> Takuma Hoshiai

--
Takuma Hoshiai <hoshiai(at)sraoss(dot)co(dot)jp>

Attachment Content-Type Size
WIP_immediate_IVM_v4.patch application/octet-stream 72.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ryan Lambert 2019-07-10 02:54:37 Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)
Previous Message Thomas Munro 2019-07-10 02:14:06 Re: Index Skip Scan