Re: hashagg slowdown due to spill changes

From: Andres Freund <andres(at)anarazel(dot)de>
To: Jeff Davis <pgsql(at)j-davis(dot)com>, Tomas Vondra <tv(at)fuzzy(dot)cz>
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, pgsql-hackers(at)postgresql(dot)org
Subject: Re: hashagg slowdown due to spill changes
Date: 2020-06-12 21:37:15
Message-ID: 20200612213715.op4ye4q7gktqvpuo@alap3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2020-06-11 11:14:02 -0700, Jeff Davis wrote:
> On Thu, 2020-06-11 at 10:45 -0700, Andres Freund wrote:
> > Did you run any performance tests?
>
> Yes, I reproduced your ~12% regression from V12, and this patch nearly
> eliminated it for me.

I spent a fair bit of time looking at the difference. Jeff had let me
know on chat that he was still seeing some difference, but couldn't
quite figure out where that was.

Trying it out myself, I observed that the patch helped, but not that
much. After a bit I found one major reason for why:
LookupTupleHashEntryHash() assigned the hash to pointer provided by the
caller's before doing the insertion. That ended up causing a pipeline
stall (I assume it's store forwarding, but not sure). Moving the
assignment to the caller variable to after the insertion got rid of
that.

It got within 3-4% after that change. I did a number of small
microoptimizations that each helped, but didn't get quite get to the
level of 12.

Finally I figured out that that's due to an issue outside of nodeAgg.c
itself:

commit 4cad2534da6d17067d98cf04be2dfc1bda8f2cd0
Author: Tomas Vondra <tomas(dot)vondra(at)postgresql(dot)org>
Date: 2020-05-31 14:43:13 +0200

Use CP_SMALL_TLIST for hash aggregate

Due to this change we end up with an additional projection in queries
like this:

postgres[212666][1]=# \d fewgroups_many_rows
Table "public.fewgroups_many_rows"
┌────────┬─────────┬───────────┬──────────┬─────────┐
│ Column │ Type │ Collation │ Nullable │ Default │
├────────┼─────────┼───────────┼──────────┼─────────┤
│ cat │ integer │ │ not null │ │
│ val │ integer │ │ not null │ │
└────────┴─────────┴───────────┴──────────┴─────────┘

postgres[212666][1]=# explain SELECT cat, count(*) FROM fewgroups_many_rows GROUP BY 1;
┌───────────────────────────────────────────────────────────────────────────────────────┐
│ QUERY PLAN │
├───────────────────────────────────────────────────────────────────────────────────────┤
│ HashAggregate (cost=1942478.48..1942478.53 rows=5 width=12) │
│ Group Key: cat │
│ -> Seq Scan on fewgroups_many_rows (cost=0.00..1442478.32 rows=100000032 width=4) │
└───────────────────────────────────────────────────────────────────────────────────────┘
(3 rows)

as 'val' is "projected away"..

After neutering the tlist change, Jeff's patch and my changes to it
yield performance *above* v12.

I don't see why it's ok to force an additional projection in the very
common case of hashaggs over a few rows. So I think we need to rethink
4cad2534da6.

Greetings,

Andres Freund

Attachment Content-Type Size
aggspeed.diff text/x-diff 15.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Ranier Vilela 2020-06-12 22:03:30 Re: Postgresql13_beta1 (could not rename temporary statistics file) Windows 64bits
Previous Message Bruce Momjian 2020-06-12 21:29:51 Re: create database with template doesn't copy ACL