Re: Memory-Bounded Hash Aggregation

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Taylor Vesely <tvesely(at)pivotal(dot)io>, Adam Lee <ali(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2019-12-27 23:35:30
Message-ID: d72e494db7b075ef02c3163e6614e698701d2c3e.camel@j-davis.com
Lists: pgsql-hackers

On Sat, 2019-12-14 at 18:32 +0100, Tomas Vondra wrote:
> So I think we're not costing the batching properly / at all.

Hi,

I've attached a new patch that adds some basic costing for spilling to
disk during hashagg.

The accuracy is unfortunately not great, especially at smaller work_mem
sizes and smaller entry sizes. The biggest discrepancy seems to be that
the estimated average size of a hash table entry is significantly
smaller than the actual average size. I'm not sure how big a problem
this inaccuracy is, or how it compares to sort, for instance (it's a
bit hard to compare because sort works with theoretical memory usage
while hashagg looks at actual allocated memory).
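
To illustrate why the entry-size estimate matters, here is a rough
sketch. It is not the cost model from the patch: the function
spill_fraction, its parameters, and all of the numbers are hypothetical
and chosen only to show the direction of the error. It assumes that
groups are kept in memory until work_mem is full and that everything
beyond that point spills.

```python
def spill_fraction(num_groups, entry_size, work_mem_bytes):
    """Fraction of groups that do not fit in memory (hypothetical model).

    Assumes each group costs exactly entry_size bytes and that groups
    are held in memory until work_mem_bytes is exhausted.
    """
    groups_in_mem = max(1, work_mem_bytes // entry_size)
    if num_groups <= groups_in_mem:
        return 0.0  # everything fits, nothing spills
    return 1.0 - groups_in_mem / num_groups


work_mem = 4 * 1024 * 1024   # 4MB, made-up setting
num_groups = 100_000         # made-up group count

# Made-up sizes: the planner assumes 64 bytes per entry, while the
# memory actually allocated per entry turns out to be 160 bytes.
estimated = spill_fraction(num_groups, 64, work_mem)
actual = spill_fraction(num_groups, 160, work_mem)

print(f"estimated spill fraction: {estimated:.2f}")
print(f"actual spill fraction:    {actual:.2f}")
```

With these made-up inputs the model expects roughly a third of the
groups to spill, while at the larger entry size roughly three quarters
do, so the disk cost would be underestimated.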

Costing was the last major TODO, so I'm considering this feature
complete, though it still needs some work on quality.

Regards,
Jeff Davis

Attachment Content-Type Size
hashagg-20191227.patch text/x-patch 100.3 KB
