Re: 9.5: Memory-bounded HashAgg

From: Tomas Vondra <tv(at)fuzzy(dot)cz>
To: pgsql-hackers(at)postgresql(dot)org
Subject: Re: 9.5: Memory-bounded HashAgg
Date: 2014-08-28 22:02:45
Message-ID: 53FFA705.5010209@fuzzy.cz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 26.8.2014 21:38, Jeff Davis wrote:
> On Tue, 2014-08-26 at 12:39 +0300, Heikki Linnakangas wrote:
>> I think this is enough for this commitfest - we have consensus on
>> the design. For the next one, please address those open items, and
>> resubmit.
>
> Agreed, return with feedback.
>
> I need to get the accounting patch in first, which needs to address
> some performance issues, but there's a chance of wrapping those up
> quickly.

Sounds good to me.

I'd like to coordinate our efforts on this a bit, if you're interested.

I've been working on the hashjoin-like batching approach PoC (because I
proposed it, so it's fair I do the work), and I came to the conclusion
that it's pretty much impossible to implement on top of dynahash. I
ended up replacing it with a hashtable (similar to the one in the
hashjoin patch, unsurprisingly), which supports the batching approach
well, and is more memory efficient and actually faster (I see ~25%
speedup in most cases, although YMMV).

I plan to address this in 4 patches:

(1) replacement of dynahash by the custom hash table (done)

(2) memory accounting (not sure what's your plan, I've used the
approach I proposed on 23/8 for now, with a few bugfixes/cleanups)

(3) applying your HashWork patch on top of this (I have this mostly
completed, but need to do more testing over the weekend)

(4) extending this with the batching I proposed, initially only for
aggregates with states that we can serialize/deserialize easily
(e.g. types passed by value) - I'd like to hack on this next week

So at this point I have (1) and (2) pretty much ready, (3) is almost
complete and I plan to start hacking on (4). Also, this does not address
the open items listed in your message.

But I agree this is more complex than the patch you proposed. So if you
choose to pursue your patch, I have no problem with that - I'll then
rebase my changes on top of your patch and submit them separately.

regards
Tomas

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2014-08-28 22:05:03 Re: Function to know last log write timestamp
Previous Message Robert Haas 2014-08-28 22:01:41 Re: Per table autovacuum vacuum cost limit behaviour strange