Re: Memory-Bounded Hash Aggregation

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Taylor Vesely <tvesely(at)pivotal(dot)io>, Adam Lee <ali(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2020-03-15 23:05:37
Message-ID: b2af57f4f8a27b34a0efb80ce823f5cd2bbb9164.camel@j-davis.com
Lists: pgsql-hackers

On Thu, 2020-03-12 at 16:01 -0500, Justin Pryzby wrote:
> I don't understand what's meant by "the chosen plan".
> Should it say, "at execution ..." instead of "execution time" ?

I removed that wording; hopefully it's clearer without it.

> Either remove "plan types" for consistency with enable_groupingsets_hash_disk,
> or add it there. Maybe it should say "when the memory usage would OTHERWISE BE
> expected to exceed.."

I added "plan types".

I don't think "otherwise be..." would quite work there. "Otherwise"
sounds to me like it's referring to another plan type (e.g.
Sort+GroupAgg), and that doesn't fit.

It's probably best to leave that level of detail out of the docs. I
think the main use case for enable_hashagg_disk is for users who
experience some plan changes and want the old behavior which favors
Sort when there are a lot of groups.

> +show_hashagg_info(AggState *aggstate, ExplainState *es)
> +{
> + Agg *agg = (Agg *)aggstate->ss.ps.plan;
> + long memPeakKb = (aggstate->hash_mem_peak + 1023) / 1024;
>
> I see this partially duplicates my patch [0] to show memory stats for

...

> You added hash_mem_peak and hash_batches_used to struct AggState.
> In my 0001 patch, I added instrumentation to struct TupleHashTable

I replied in that thread and I'm not sure that tracking the memory in
the TupleHashTable is the right approach. The group keys and the
transition state data can't be estimated easily that way. Perhaps we
can do that if the THT owns the memory contexts (and can call
MemoryContextMemAllocated()), rather than using passed-in ones, but
that might require more discussion. (I'm open to that idea, by the
way.)

My patch also accounts for the buffer space, so would that be a
third memory number?

For now, I'll keep my reporting in a simpler form, and we can change it
as we sort out these details. That leaves it specific to HashAgg, but we
can always refactor later.

I did change my code to put the metacontext in its own child context, so
that I can call MemoryContextMemAllocated() on it and include it in the
memory total; that will also make it easier to report separately when we
want to.

> + if (from_tape)
> + partition_mem += HASHAGG_READ_BUFFER_SIZE;
> + partition_mem = npartitions * HASHAGG_WRITE_BUFFER_SIZE;
>
> => That looks wrong ; should say += ?

Good catch! Fixed.

Regards,
Jeff Davis

Attachment Content-Type Size
hashagg-20200315.patch text/x-patch 80.9 KB
