Re: Additional size of hash table is alway zero for hash aggregates

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Andrew Gierth <andrew(at)tao11(dot)riddles(dot)org(dot)uk>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Pengzhou Tang <ptang(at)pivotal(dot)io>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Additional size of hash table is alway zero for hash aggregates
Date: 2020-03-22 01:26:59
Message-ID: 20200322012659.5omhszhzqii3lq35@alap3.anarazel.de
Lists: pgsql-hackers

Hi,

On 2020-03-21 17:45:31 -0700, Jeff Davis wrote:
> Or, we can keep the 'additionalsize' argument but put it to work to store
> the AggStatePerGroupData inline in the hash table. That would allow us
> to remove the 'additional' pointer from TupleHashEntryData, saving 8
> bytes plus the chunk header for every group. That sounds very tempting.

I don't see how? That'd require making the hash bucket addressing deal
with variable sizes, which'd be bad for performance reasons. Since there
can be aggstate->numtrans AggStatePerGroupDatas for each hash table
entry, I don't see how to avoid a variable size?
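
A minimal sketch of the addressing concern, under stated assumptions: PerGroup, FixedEntry and InlineEntry are simplified stand-ins invented for illustration, not the real AggStatePerGroupData or TupleHashEntryData definitions. With a fixed entry type the bucket stride is a compile-time constant; with numtrans states stored inline, the stride is only known at run time and has to be carried into every lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a per-group transition state. */
typedef struct PerGroup
{
    int64_t       trans_value;
    unsigned char trans_value_is_null;
    unsigned char no_trans_value;
} PerGroup;

/* Fixed-size entry: per-group states live out of line behind a pointer. */
typedef struct FixedEntry
{
    uint32_t  hash;
    PerGroup *additional;
} FixedEntry;

/* Hypothetical inline layout: header followed by numtrans PerGroup structs. */
typedef struct InlineEntry
{
    uint32_t hash;
    PerGroup pergroup[];    /* numtrans elements, known only at run time */
} InlineEntry;

/* The element type is known at compile time, so the stride is a constant. */
static FixedEntry *
fixed_bucket(FixedEntry *buckets, size_t idx)
{
    return &buckets[idx];
}

/* Size of one inline entry for a given number of transition states. */
static size_t
inline_entry_size(int numtrans)
{
    assert(numtrans >= 0);
    return offsetof(InlineEntry, pergroup) +
        (size_t) numtrans * sizeof(PerGroup);
}

/* The stride is a run-time value: every lookup needs a multiply by it. */
static InlineEntry *
inline_bucket(char *buckets, size_t entry_size, size_t idx)
{
    return (InlineEntry *) (buckets + idx * entry_size);
}
```

In this sketch the size varies per hash table rather than per entry, but it is still not a constant the compiler can fold into the bucket addressing.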

> If we want to get even more clever, we could try to squish
> AggStatePerGroupData into 8 bytes by putting the flags
> (transValueIsNull and noTransValue) into unused bits of the Datum.
> That would work if the transtype is by-ref (low bits of the pointer will
> be unused), or if the type's size is less than 8, or if the particular
> aggregate doesn't need either of those booleans. It would get messy,
> but saving 8 bytes per group is non-trivial.

I'm somewhat doubtful it's worth going for those per-type optimizations
- the wins don't seem large enough, relative to other per-group space
needs. It also adds instructions to fetching those values...
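
For the by-ref case, the flag packing discussed above could look roughly like the following. This is an illustrative sketch only: PackedTrans and the packed_* helpers are hypothetical names, not PostgreSQL APIs, and it assumes the pointed-to value is at least 4-byte aligned so that the two low bits are free. It also shows the extra work on fetch: every access to the pointer needs a mask, and every flag test needs a bit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Two flag bits stored in the low bits of an aligned pointer. */
#define FLAG_TRANS_IS_NULL  ((uintptr_t) 1)
#define FLAG_NO_TRANS_VALUE ((uintptr_t) 2)
#define FLAG_MASK           ((uintptr_t) 3)

/* Hypothetical packed representation: pointer plus flags in one word. */
typedef uintptr_t PackedTrans;

static PackedTrans
packed_make(void *ptr, bool is_null, bool no_value)
{
    uintptr_t bits = (uintptr_t) ptr;

    /* Assumption: the pointer is at least 4-byte aligned. */
    assert((bits & FLAG_MASK) == 0);
    if (is_null)
        bits |= FLAG_TRANS_IS_NULL;
    if (no_value)
        bits |= FLAG_NO_TRANS_VALUE;
    return bits;
}

/* Fetching the pointer costs an extra AND compared to a plain load. */
static void *
packed_pointer(PackedTrans p)
{
    return (void *) (p & ~FLAG_MASK);
}

static bool
packed_is_null(PackedTrans p)
{
    return (p & FLAG_TRANS_IS_NULL) != 0;
}

static bool
packed_no_value(PackedTrans p)
{
    return (p & FLAG_NO_TRANS_VALUE) != 0;
}
```

By-value transition types would need a different scheme (for example, spare high bits when the type is narrower than the word), which is where the per-type special cases come from.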

If we want to optimize memory usage, I think I'd first go for allocating
the group's "firstTuple" together with all the AggStatePerGroupDatas.
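
A rough sketch of that idea, assuming a plain malloc-style allocator rather than PostgreSQL memory contexts: alloc_group, align_up and PerGroup are hypothetical names used only for illustration. The tuple bytes and the numtrans per-group states share a single chunk, so each group pays for one allocation (and one chunk header) instead of two.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a per-group transition state. */
typedef struct PerGroup
{
    int64_t       trans_value;
    unsigned char trans_value_is_null;
    unsigned char no_trans_value;
} PerGroup;

/* Round len up to a multiple of the alignment PerGroup requires. */
static size_t
align_up(size_t len)
{
    size_t a = _Alignof(PerGroup);

    return (len + a - 1) / a * a;
}

/*
 * Copy the tuple and reserve room for numtrans zeroed PerGroup structs in
 * one chunk. Returns the chunk (tuple at offset 0) and sets *pergroup to
 * the start of the per-group array; returns NULL on allocation failure.
 */
static void *
alloc_group(const void *tuple, size_t tuple_len, int numtrans,
            PerGroup **pergroup)
{
    size_t pergroup_off = align_up(tuple_len);
    char  *chunk;

    assert(numtrans >= 0);
    chunk = calloc(1, pergroup_off + (size_t) numtrans * sizeof(PerGroup));
    if (chunk == NULL)
        return NULL;

    memcpy(chunk, tuple, tuple_len);
    *pergroup = (PerGroup *) (chunk + pergroup_off);
    return chunk;
}
```

The hash entry itself stays fixed-size in this layout: it still holds one pointer, and the variable-length part lives entirely in the out-of-line chunk.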

Greetings,

Andres Freund
