Re: Combining Aggregates

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>
Cc: Simon Riggs <simon(at)2ndquadrant(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>
Subject: Re: Combining Aggregates
Date: 2014-12-17 13:14:12
Message-ID: CAOeZVid3R6SV7R2EFvK36YzWMEU3g5rYJKAUNQqKcP3crTFMew@mail.gmail.com
Lists: pgsql-hackers

On Wed, Dec 17, 2014 at 6:05 PM, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>
> Simon,
>
> Its concept is good to me. I think, the new combined function should be
> responsible to take a state data type as argument and update state object
> of the aggregate function. In other words, combined function performs like
> transition function but can update state object according to the summary
> of multiple rows. Right?
>
> It also needs some enhancement around Aggref/AggrefExprState structure to
> inform which function shall be called on execution time.
> Combined functions are usually no-thank-you. AggrefExprState updates its
> internal state using transition function row-by-row. However, once someone
> push-down aggregate function across table joins, combined functions have
> to be called instead of transition functions.
> I'd like to suggest Aggref has a new flag to introduce this aggregate
> expects state object instead of scalar value.
>
> Also, I'd like to suggest one other flag in Aggref not to generate final
> result, and returns state object instead.
>
>
>
So are you proposing not calling transfuncs at all and just using combined
functions?

That sounds counterintuitive to me. I am not able to see why you would want
to avoid transfns totally even for the case of pushing down aggregates that
you mentioned.

From Simon's example mentioned upthread:

PRE-AGGREGATED PLAN
Aggregate
-> Join
     -> PreAggregate (doesn't call finalfn)
          -> Scan BaseTable1
     -> Scan BaseTable2

In that plan, finalfn wouldn't be called by the PreAggregate node. Instead,
the combined function would be responsible for getting the preaggregate
results and combining them (unless, of course, I am missing something).
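
To make that division of labour concrete, here is a minimal sketch in Python
(not PostgreSQL code) of an avg-style aggregate. The names avg_transfn,
avg_combinefn, avg_finalfn and preaggregate, and the (sum, count) state, are
hypothetical and chosen only for illustration. Note that the transition
function still runs row by row inside the preaggregate step; only the upper
step uses the combine function.

```python
# Illustrative sketch only: hypothetical names, not PostgreSQL's API.
from typing import Iterable, Tuple

State = Tuple[float, int]  # assumed state for avg: (running sum, row count)


def avg_transfn(state: State, value: float) -> State:
    """Transition function: fold one input row into the state."""
    total, count = state
    return (total + value, count + 1)


def avg_combinefn(a: State, b: State) -> State:
    """Combine function: merge two partial states; no raw rows are involved."""
    return (a[0] + b[0], a[1] + b[1])


def avg_finalfn(state: State) -> float:
    """Final function: turn the state into the aggregate's result."""
    total, count = state
    return total / count


def preaggregate(rows: Iterable[float]) -> State:
    """PreAggregate step: runs the transfn row by row but skips the finalfn."""
    state: State = (0.0, 0)
    for value in rows:
        state = avg_transfn(state, value)
    return state


# Two partial states, e.g. produced for two groups of rows below a join.
partial_1 = preaggregate([1.0, 2.0, 3.0])
partial_2 = preaggregate([4.0, 5.0])

# The upper step combines the states and only then applies the finalfn.
result = avg_finalfn(avg_combinefn(partial_1, partial_2))
```

With these inputs, combining the two partial states gives the same average
as running the transition function over all five rows in one pass.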

Special-casing transition state updates in Aggref seems like a bad idea to
me. I think it would be better to make this explicit, i.e. add a new node
on top that does the combination (it would be primarily responsible for
calling the combined function).

It may not be the best source of inspiration, but the way SQL Server does it
(Exchange operator + Stream Aggregate) seems intuitive to me, and having the
combination operation as a separate top node might be cleaner.
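
As a rough sketch of what "a separate top node" could look like, the Python
below models two plan nodes as generators. This is an assumption-laden toy,
not how PostgreSQL or SQL Server implement it: partial_aggregate and combine
are hypothetical names, and the per-group (sum, count) state is made up for
the example. The point is only that the lower node knows about rows and the
top node knows about states.

```python
# Toy iterator-model sketch; node names and state layout are hypothetical.
from typing import Dict, Iterable, Iterator, Tuple

State = Tuple[int, int]  # assumed per-group state: (sum, count)


def partial_aggregate(rows: Iterable[Tuple[str, int]]) -> Iterator[Tuple[str, State]]:
    """Lower node: applies the transition step per row, emits states per group."""
    states: Dict[str, State] = {}
    for key, value in rows:
        total, count = states.get(key, (0, 0))
        states[key] = (total + value, count + 1)
    yield from states.items()


def combine(children: Iterable[Iterator[Tuple[str, State]]]) -> Iterator[Tuple[str, float]]:
    """Separate top node: merges incoming states, then finalizes each group."""
    merged: Dict[str, State] = {}
    for child in children:
        for key, (total, count) in child:
            prev_total, prev_count = merged.get(key, (0, 0))
            merged[key] = (prev_total + total, prev_count + count)
    for key, (total, count) in merged.items():
        yield key, total / count  # final step happens only here


# Two independent input streams feeding one combining node.
stream_a = partial_aggregate([("x", 1), ("y", 10), ("x", 3)])
stream_b = partial_aggregate([("x", 5), ("y", 20)])
results = dict(combine([stream_a, stream_b]))
```

Keeping the merge logic in its own node means the lower node needs no flag
telling it whether its input is rows or states.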

I may be wrong though.

Regards,

Atri
