Quick Links

Re: hashagg slowdown due to spill changes

From:	Andres Freund <andres(at)anarazel(dot)de>
To:	Jeff Davis <pgsql(at)j-davis(dot)com>
Cc:	pgsql-hackers(at)postgresql(dot)org
Subject:	Re: hashagg slowdown due to spill changes
Date:	2020-06-08 21:08:19
Message-ID:	20200608210819.edcfdqykv3ciwqxq@alap3.anarazel.de
Views:	Raw Message \| Whole Thread \| Download mbox \| Resend email
Thread:
Lists:	pgsql-hackers

Hi,

On 2020-06-08 13:41:29 -0700, Jeff Davis wrote:
> On Fri, 2020-06-05 at 21:11 -0700, Andres Freund wrote:
> > Before there was basically one call from nodeAgg.c to execGrouping.c
> > for
> > each tuple and hash table. Now it's a lot more complicated:
> > 1) nodeAgg.c: prepare_hash_slot()
> > 2) execGrouping.c: TupleHashTableHash()
> > 3) nodeAgg.c: lookup_hash_entry()
> > 4) execGrouping.c: LookupTupleHashEntryHash()
>
> The reason that I did it that way was to be able to store the hash
> along with the saved tuple (similar to what HashJoin does), which
> avoids recalculation.

That makes sense. But then you can just use a separate call into
execGrouping for that purpose.

> > Why isn't the flow more like this:
> > 1) prepare_hash_slot()
> > 2) if (aggstate->hash_spill_mode) goto 3; else goto 4
> > 3) entry = LookupTupleHashEntry(&hash); if (!entry)
> > hashagg_spill_tuple();
> > 4) InsertTupleHashEntry(&hash, &isnew); if (isnew) initialize(entry)
>
> I'll work up a patch to refactor this. I'd still like to see if we can
> preserve the calculate-hash-once behavior somehow.

Cool!

Greetings,

Andres Freund

In response to

Re: hashagg slowdown due to spill changes at 2020-06-08 20:41:29 from Jeff Davis

Browse pgsql-hackers by date

	From	Date	Subject
Next Message	Andres Freund	2020-06-08 21:27:39	Re: Bump default wal_level to logical
Previous Message	Jeff Davis	2020-06-08 20:55:47	Re: hashagg slowdown due to spill changes