Re: Combining Aggregates

From: Atri Sharma <atri(dot)jiit(at)gmail(dot)com>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>
Subject: Re: Combining Aggregates
Date: 2014-12-17 14:15:31
Message-ID: CAOeZVifw7pkkXvpzrEZ9waqFPLOf_xz+bNRzLBHnZ1siSz=+mg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Dec 17, 2014 at 7:18 PM, Simon Riggs <simon(at)2ndquadrant(dot)com> wrote:
>
> On 17 December 2014 at 12:35, Kouhei Kaigai <kaigai(at)ak(dot)jp(dot)nec(dot)com> wrote:
>
> > Its concept is good to me. I think, the new combined function should be
> > responsible to take a state data type as argument and update state object
> > of the aggregate function. In other words, combined function performs
> like
> > transition function but can update state object according to the summary
> > of multiple rows. Right?
>
> That wasn't how I defined it, but I think your definition is better.
>
> > It also needs some enhancement around Aggref/AggrefExprState structure to
> > inform which function shall be called on execution time.
> > Combined functions are usually no-thank-you. AggrefExprState updates its
> > internal state using transition function row-by-row. However, once
> someone
> > push-down aggregate function across table joins, combined functions have
> > to be called instead of transition functions.
> > I'd like to suggest Aggref has a new flag to introduce this aggregate
> expects
> > state object instead of scalar value.
> >
> > Also, I'd like to suggest one other flag in Aggref not to generate final
> > result, and returns state object instead.
> >
> > Let me use David's example but little bit adjusted.
> >
> > original)
> > SELECT p.name, AVG(s.qty)
> > FROM sales s INNER JOIN product p ON s.product_id = s.product_id
> > GROUP BY p.name;
> >
> > modified)
> > SELECT p.name, AVG(qty)
> > FROM (SELECT product_id, AVG(qty) AS qty FROM sales GROUP BY
> product_id) s
> > INNER JOIN product p
> > ON p.product_id = s.product_id GROUP BY p_name;
> >
> > Let's assume the modifier set a flag of use_combine_func on the AVG(qty)
> of
> > the main query, and also set a flag of not_generate_final on the
> AVG(qty) of
> > the sub-query.
> > It shall work as we expected.
>
> That matches my thinking exactly.
>
> David, if you can update your patch with some docs to explain the
> behaviour, it looks complete enough to think about committing it in
> early January, to allow other patches that depend upon it to stand a
> chance of getting into 9.5. (It is not yet ready, but I see it could
> be).
>
> The above example is probably the best description of the need, since
> user defined aggregates must also understand this.
>
> Could we please call these "combine functions" or other? MERGE is an
> SQL Standard statement type that we will add later, so it will be
> confusing if we use the "merge" word in this context.
>
> David, your patch avoids creating any mergefuncs for existing
> aggregates. We would need to supply working examples for at least a
> few of the builtin aggregates, so we can test the infrastructure. We
> can add examples for all cases later.
>
>
>
I am still missing how and why we skip transfns. What am I missing,please?

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Simon Riggs 2014-12-17 14:16:01 Re: parallel mode and parallel contexts
Previous Message Heikki Linnakangas 2014-12-17 14:07:34 Re: GiST kNN search queue (Re: KNN-GiST with recheck)