Re: Memory-Bounded Hash Aggregation

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2019-07-12 06:59:55
Message-ID: 20190712065955.pm2rnk2hxa3qqawf@development
Lists: pgsql-hackers

On Thu, Jul 11, 2019 at 06:06:33PM -0700, Jeff Davis wrote:
>On Thu, 2019-07-11 at 17:55 +0200, Tomas Vondra wrote:
>> Makes sense. I haven't thought about how the hybrid approach would be
>> implemented very much, so I can't quite judge how complicated it would
>> be to extend "approach 1" later. But if you think it's a sensible first
>> step, I trust you. And I certainly agree we need something to compare
>> the other approaches against.
>
>Is this a duplicate of your previous email?
>

Yes. I don't know how I managed to send it again. Sorry.

>I'm slightly confused but I will use the opportunity to put out another
>WIP patch. The patch could use a few rounds of cleanup and quality
>work, but the functionality is there and the performance seems
>reasonable.
>
>I rebased on master and fixed a few bugs, and most importantly, added
>tests.
>
>It seems to be working with grouping sets fine. It will take a little
>longer to get good performance numbers, but even for group size of one,
>I'm seeing HashAgg get close to Sort+Group in some cases.
>

Nice! That's very nice progress!

>You are right that the missed lookups appear to be costly, at least
>when the data all fits in system memory. I think it's the cache misses,
>because sometimes reducing work_mem improves performance. I'll try
>tuning the number of buckets for the hash table and see if that helps.
>If not, then the performance still seems pretty good to me.
>
>Of course, HashAgg can beat sort for larger group sizes, but I'll try
>to gather some more data on the cross-over point.
>

Yes, makes sense. I think it's acceptable as long as we consider this
during costing (when we know in advance we'll need this) or treat it as
an emergency measure.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
