Re: explain HashAggregate to report bucket and memory stats

From: Andres Freund <andres(at)anarazel(dot)de>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Subject: Re: explain HashAggregate to report bucket and memory stats
Date: 2020-02-03 14:53:01
Message-ID: 20200203145301.53mozz7gcdaklnjc@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-01-03 10:19:26 -0600, Justin Pryzby wrote:
> On Sun, Feb 17, 2019 at 11:29:56AM -0500, Jeff Janes wrote:
> https://www.postgresql.org/message-id/CAMkU%3D1zBJNVo2DGYBgLJqpu8fyjCE_ys%2Bmsr6pOEoiwA7y5jrA%40mail.gmail.com
> > What would I find very useful is [...] if the HashAggregate node under
> > "explain analyze" would report memory and bucket stats; and if the Aggregate
> > node would report...anything.

Yea, that'd be amazing. It probably should be something every
execGrouping.c using node can opt into.

Justin: As far as I can tell, you're trying to share one instrumentation
state between hashagg and hashjoins. I'm doubtful that's a good
idea. The cases are different enough that that's probably just going to
be complicated, without actually simplifying anything.

> Jeff: can you suggest what details Aggregate should show ?

Memory usage most importantly. Probably makes sense to differentiate
between the memory for the hashtable itself, and the tuples in it (since
they're allocated separately, and just having a overly large hashtable
doesn't hurt that much if it's not filled).

> diff --git a/src/backend/executor/execGrouping.c b/src/backend/executor/execGrouping.c
> index 3603c58..cf0fe3c 100644
> --- a/src/backend/executor/execGrouping.c
> +++ b/src/backend/executor/execGrouping.c
> @@ -203,6 +203,11 @@ BuildTupleHashTableExt(PlanState *parent,
> hashtable->hash_iv = 0;
>
> hashtable->hashtab = tuplehash_create(metacxt, nbuckets, hashtable);
> + hashtable->hinstrument.nbuckets_original = nbuckets;
> + hashtable->hinstrument.nbuckets = nbuckets;
> + hashtable->hinstrument.space_peak = entrysize * hashtable->hashtab->size;

That's not actually an accurate accounting of memory, because for filled
entries a lot of memory is used to store actual tuples:

static TupleHashEntryData *
lookup_hash_entry(AggState *aggstate)
...
/* find or create the hashtable entry using the filtered tuple */
entry = LookupTupleHashEntry(perhash->hashtable, hashslot, &isnew);

if (isnew)
{
AggStatePerGroup pergroup;
int transno;

pergroup = (AggStatePerGroup)
MemoryContextAlloc(perhash->hashtable->tablecxt,
sizeof(AggStatePerGroupData) * aggstate->numtrans);
entry->additional = pergroup;

since the memory doesn't actually shrink unless the hashtable is
destroyed or reset, it'd probably be sensible to compute the memory
usage either at reset, or at the end of the query.

Greetings,

Andres Freund

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tom Lane 2020-02-03 15:56:09 Re: BUG #16171: Potential malformed JSON in explain output
Previous Message Tom Lane 2020-02-03 14:40:01 Re: Add %x to PROMPT1 and PROMPT2