Re: PoC: Grouped base relation

From: David Rowley <david(dot)rowley(at)2ndquadrant(dot)com>
To: Antonin Houska <ah(at)cybertec(dot)at>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: PoC: Grouped base relation
Date: 2017-01-16 23:42:34
Message-ID: CAKJS1f-r+gijY8r+VRZC-LLrxHMhVnvRMPWNfnBoCpKbY1Ha_g@mail.gmail.com
Lists: pgsql-hackers

On 10 January 2017 at 06:56, Antonin Houska <ah(at)cybertec(dot)at> wrote:
> Before performing the final aggregation, we need to multiply sum(a.x) by
> count(j.*) because w/o the aggregation at base relation level the input
> of the query-level aggregation would look like
>
>  a.i | a.x | b.j
> -----+-----+-----
>    1 |   3 |   1
>    1 |   4 |   1
>    1 |   3 |   1
>    1 |   4 |   1
>
> In other words, grouping of the base relation "b" below the join prevents the
> join from bringing per-group input set to the aggregate input multiple
> times. To compensate for this effect, I've added a new field "aggtransmultifn"
> to pg_aggregate catalog. It "multiplies" the aggregate state w/o repeated
> processing of the same input set many times. sum() is an example of an
> aggregate that needs such processing, avg() is one that does not.
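
To make the compensation concrete, here is a small Python sketch of the arithmetic behind the proposed multiply step, using the numbers from the quoted example (the variable names are illustrative, not PostgreSQL internals):

```python
# Values from the quoted example: one group of "a" with a.x in {3, 4},
# and a join against "b" that matches each "a" row twice.
a_x = [3, 4]
dup = 2  # rows of "b" joining to each group

# Without grouping below the join, sum() sees every value "dup" times:
naive_sum = sum(v for v in a_x for _ in range(dup))

# With grouping pushed below the join, sum() runs once per group and the
# proposed multiply step compensates for the join's duplication:
partial_sum = sum(a_x)
compensated_sum = partial_sum * dup

assert naive_sum == compensated_sum == 14

# avg() needs no such step: duplicating every input row doesn't change it.
naive_avg = sum(v for v in a_x for _ in range(dup)) / (len(a_x) * dup)
assert naive_avg == sum(a_x) / len(a_x)
```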

First off, I'd like to say that making improvements in this area
sounds like a great thing. I'm very keen to see progress here.

I've been thinking about this aggtransmultifn and I'm not sure if it's
really needed. Adding a whole series of new transition functions is
quite a pain. At least I think so, and I have a feeling Robert might
agree with me.

Let's imagine some worst case (and somewhat silly) aggregate query:

SELECT count(*)
FROM million_row_table
CROSS JOIN another_million_row_table;

Today that's going to cause 1 TRILLION transitions! Performance will
be terrible.

If we pushed the aggregate down into one of those tables, performed a
Partial Aggregate there, and then ran a Finalize Aggregate on the
single-row result (after the join), that's 1 million transfn calls and
1 million combinefn calls, one for each row produced by the join.

If we did it your way (provided I understand your proposal correctly),
there are 1 million transfn calls on one relation, then 1 million on
the other, and then a single multiplyfn call, which computes
1000000 * 1000000.

What did we save vs. using the existing aggcombinefn infrastructure
which went into 9.6? Using this actually costs us 1 extra function
call, right? I'd imagine the size of the patch to use aggcombinefn
instead would be a fraction of the size of the one which included all
the new aggmultiplyfns and pg_aggregate.h changes.
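
For the worst case above, the call counts work out roughly as follows (a back-of-the-envelope sketch; "calls" here counts only transition-level function invocations, ignoring per-call cost differences):

```python
n_a = n_b = 1_000_000  # rows in each side of the cross join

# Today: one transfn call per joined row.
plain = n_a * n_b

# 9.6-style partial aggregation: transfn once per row of the aggregated
# side, then combinefn once per row the join produces.
partial_agg = n_a + n_b

# Proposed multiplyfn approach: aggregate each side separately, then a
# single multiply of the two states.
multiplyfn = n_a + n_b + 1

assert plain == 1_000_000_000_000     # the "1 TRILLION transitions"
assert multiplyfn - partial_agg == 1  # one extra call vs. combinefn
```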

There's already a lot of infrastructure in there to help you detect
when this optimisation can be applied. For example,
AggClauseCosts.hasNonPartial will be true if any aggregate lacks a
combinefn, or if there are any DISTINCT or ORDER BY aggregates, either
of which also means you can't apply the optimisation.

It sounds like a much more manageable project by using aggcombinefn
instead. Then maybe one day, when we can detect that a join did not
cause any result duplication (i.e. Unique Joins), we could finalise
the aggregates on the first call and skip the combine state
altogether.

Thanks for your work on this.

Regards

David Rowley
