Re: Large Scale Aggregation (HashAgg Enhancement)

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Simon Riggs <simon(at)2ndquadrant(dot)com>
Cc: Rod Taylor <pg(at)rbt(dot)ca>, PostgreSQL Development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Large Scale Aggregation (HashAgg Enhancement)
Date: 2006-01-17 14:52:10
Message-ID: 28124.1137509530@sss.pgh.pa.us
Lists: pgsql-hackers

Simon Riggs <simon(at)2ndquadrant(dot)com> writes:
> On Mon, 2006-01-16 at 20:02 -0500, Tom Lane wrote:
>> But our idea of the number of batches needed can change during that
>> process, resulting in some inner tuples being initially assigned to the
>> wrong temp file. This would also be true for hashagg.

> So we correct that before we start reading the outer table.

Why? That would require a useless additional pass over the data. With
the current design, we can process and discard at least *some* of the
data in a temp file when we read it, but a reorganization pass would
mean that it *all* goes back out to disk a second time.

Also, you assume that we can accurately tell how many tuples will fit in
memory in advance of actually processing them --- a presumption clearly
false in the hashagg case, and not that easy to do even for hashjoin.
(You can tell the overall size of a temp file, sure, but how do you know
how it will split when the batch size changes? A perfectly even split
is unlikely.)
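
To make that concrete, here is a rough sketch of why no reorganization
pass is needed when the batch count grows mid-stream. This is
illustrative only, not the actual nodeHashjoin.c logic: the batch
number is computed as (hashvalue % nbatch) purely for simplicity, and
plain stdio files stand in for the real BufFile machinery.

#include <stdio.h>
#include <stdint.h>

#define MAX_BATCHES 64

typedef struct
{
    uint32_t    hashvalue;
    /* ... tuple payload would follow ... */
} Tuple;

/* One temp file per batch; a stand-in for the real temp-file code. */
static FILE *batch_file[MAX_BATCHES];

/*
 * Read back batch `curbatch` after nbatch has grown.  Tuples whose new
 * batch number still equals curbatch are handed to `load` (i.e., put
 * into the in-memory hash table) and processed right away; only the
 * tuples that genuinely moved get written out a second time, to their
 * new, later batch file.
 */
static void
reload_batch(int curbatch, int nbatch, void (*load)(const Tuple *))
{
    Tuple       tup;

    rewind(batch_file[curbatch]);
    while (fread(&tup, sizeof(tup), 1, batch_file[curbatch]) == 1)
    {
        int         newbatch = tup.hashvalue % nbatch;

        if (newbatch == curbatch)
            load(&tup);         /* consumed on first read, no extra I/O */
        else
            fwrite(&tup, sizeof(tup), 1, batch_file[newbatch]);
    }
}

The only tuples that incur a second trip to disk are the ones that
truly belong to a later batch; a reorganization pass would rewrite
everything.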

> OK, I see what you mean. Sounds like we should have a new definition for
> Aggregates, "Sort Insensitive" that allows them to work when the input
ordering does not affect the result, since that case can be optimised
> much better when using HashAgg.

Please don't propose pushing this problem onto the user until it's
demonstrated that there's no other way. I don't want to become the
next Oracle, with forty zillion knobs that it takes a highly trained
DBA to deal with.

> But all of them sound ugly.

I was thinking along the lines of having multiple temp files per hash
bucket. If you have a tuple that needs to migrate from bucket M to
bucket N, you know that it arrived before every tuple that was assigned
to bucket N originally, so put such tuples into a separate temp file
and process them before the main bucket-N temp file. This might get a
little tricky to manage after multiple hash resizings, but in principle
it seems doable.
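
Continuing the sketch above, the per-bucket bookkeeping might look
roughly like this. This is a hypothetical structure for the
single-resize case (nothing like it exists in the tree); handling
repeated resizings would need a list of early files per bucket rather
than just one.

/* Reuses the Tuple type from the earlier sketch. */
typedef struct BucketFiles
{
    FILE       *early;          /* tuples migrated from earlier buckets */
    FILE       *main;           /* tuples assigned here from the start */
} BucketFiles;

/*
 * When a spilled tuple turns out to belong to bucket N rather than the
 * bucket it was first written to, append it to N's early file.  Every
 * such tuple was written before any tuple that went to N directly.
 */
static void
migrate_tuple(BucketFiles *buckets, const Tuple *tup, int nbatch)
{
    int         newbucket = tup->hashvalue % nbatch;

    fwrite(tup, sizeof(*tup), 1, buckets[newbucket].early);
}

/*
 * Replaying the early file before the main file therefore preserves
 * the original arrival order, which is what would let order-sensitive
 * aggregates keep working across a resize.
 */
static void
replay_bucket(BucketFiles *b, void (*load)(const Tuple *))
{
    Tuple       tup;

    rewind(b->early);
    while (fread(&tup, sizeof(tup), 1, b->early) == 1)
        load(&tup);

    rewind(b->main);
    while (fread(&tup, sizeof(tup), 1, b->main) == 1)
        load(&tup);
}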

regards, tom lane
