Re: Memory-Bounded Hash Aggregation

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Melanie Plageman <melanieplageman(at)gmail(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, Taylor Vesely <tvesely(at)pivotal(dot)io>, Adam Lee <ali(at)pivotal(dot)io>, Melanie Plageman <mplageman(at)pivotal(dot)io>, Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Memory-Bounded Hash Aggregation
Date: 2020-02-21 00:56:38
Message-ID: 732b17848c2333e50f146e656a9c4f75ff818f05.camel@j-davis.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, 2020-02-19 at 20:16 +0100, Tomas Vondra wrote:
> 1) explain.c currently does this:
>
> I wonder if we could show something for plain explain (without
> analyze).
> At least the initial estimate of partitions, etc. I know not showing
> those details until after execution is what e.g. sort does, but I
> find
> it a bit annoying.

Looks like you meant to include some example explain output, but I
think I understand what you mean. I'll look into it.

> 2) The ExecBuildAggTrans comment should probably explain "spilled".

Done.

> 3) I wonder if we need to invent new opcodes? Wouldn't it be simpler
> to
> just add a new flag to the agg_* structs instead? I haven't tried
> hacking
> this, so maybe it's a silly idea.

There was a reason I didn't do it this way, but I'm trying to remember
why. I'll look into this, also.

> 4) lookup_hash_entries says
>
> /* check to see if we need to spill the tuple for this grouping
> set */
>
> But that seems bogus, because AFAIK we can't spill tuples for
> grouping
> sets. So maybe this should say just "grouping"?

Yes, we can spill tuples for grouping sets. Unfortunately, I think my
tests (which covered this case previously) don't seem to be exercising
that path well now. I am going to improve my tests, too.

> 5) Assert(nbuckets > 0);

I did not repro this issue, but I did set a floor of 256 buckets.

> which fails with segfault at execution time:

Fixed. I was resetting the hash table context without setting the
pointers to NULL.

Thanks! I attached a new, rebased version. The fixes are quick fixes
for now and I will revisit them after I improve my test cases (which
might find more issues).

Regards,
Jeff Davis

Attachment Content-Type Size
hashagg-20200220.patch text/x-patch 104.4 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2020-02-21 01:49:16 Re: custom postgres launcher for tests
Previous Message Andrew Gierth 2020-02-21 00:35:03 Re: Protocol problem with GSSAPI encryption?