Re: Large Scale Aggregation (HashAgg Enhancement)

From: Simon Riggs <simon(at)2ndquadrant(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Large Scale Aggregation (HashAgg Enhancement)
Date: 2006-01-17 19:17:37
Message-ID: 1137525457.3180.272.camel@localhost.localdomain
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, 2006-01-16 at 12:36 -0500, Tom Lane wrote:

> The tricky part is to preserve the existing guarantee that tuples are
> merged into their aggregate in arrival order. (This does not matter for
> the standard aggregates but it definitely does for custom aggregates,
> and there will be unhappy villagers appearing on our doorsteps if we
> break it.) I think this can work correctly under the above sketch but
> it needs to be verified. It might require different handling of the
> TODO files than what hashjoin does.

You almost had me there... but there isn't any "arrival order". The sort
that precedes an aggregation only sorts on the GROUP BY columns, not on
additional columns - so by the SQL standard there is not a guaranteed
ordering of the data into a aggregate. That is exactly what windowed
aggregates are for. (There isn't any way of specifying an ORDER BY yet
either).

The only way of doing this is by doing a derived table
select a, sum(b) from (select a,b order by a,b);
but AFAICS this is not part of the standard??

It is highly likely that rows are clumped together, but there just isn't
any guarantee that is the case. Any update of any row would change the
arrival order. Should we support something that has worked by luck?

I've been looking into windowed aggregates; these will provide this
functionality should people require it. I don't see how we'd be able to
do windowed aggregates and hashAgg at the same time, so this seems less
relevant.

Best Regards, Simon Riggs

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2006-01-17 19:41:20 Re: Large Scale Aggregation (HashAgg Enhancement)
Previous Message Jaime Casanova 2006-01-17 18:28:03 Re: Error working with Temporary Sequences in plpgsql in 8.1 (8.0 works fine)