Re: Memory-Bounded Hash Aggregation

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Richard Guo <guofenglinux(at)gmail(dot)com>
Cc: Jeff Davis <pgsql(at)j-davis(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Taylor Vesely <tvesely(at)pivotal(dot)io>, Adam Lee <ali(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2020-03-27 01:31:08
Message-ID: 20200327013108.tiskardwkqr5otg6@development
Lists: pgsql-hackers

On Thu, Mar 26, 2020 at 05:56:56PM +0800, Richard Guo wrote:
>Hello,
>
>When calculating the disk costs of hash aggregation that spills to disk,
>there is something wrong with how we determine depth:
>
>> depth = ceil( log(nbatches - 1) / log(num_partitions) );
>
>If nbatches is some number between 1.0 and 2.0, we would have a negative
>depth. As a result, we may have a negative cost for hash aggregation
>plan node, as described in [1].
>
>I don't think 'log(nbatches - 1)' is what we want here. Should it be
>just '(nbatches - 1)'?
>

I think using log() is correct, but why should we allow fractional
nbatches values between 1.0 and 2.0? You either have 1 batch or 2
batches, you can't have 1.5 batches. So I think the issue is here:

    nbatches = Max((numGroups * hashentrysize) / mem_limit,
                   numGroups / ngroups_limit);

and we should probably do

    nbatches = ceil(nbatches);

right after it.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
