Re: Disk-based hash aggregate's cost model

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disk-based hash aggregate's cost model
Date: 2020-09-04 19:01:37
Message-ID: 20200904190137.scrucxyjkxyc2bmk@development
Lists: pgsql-hackers

On Fri, Sep 04, 2020 at 11:31:36AM -0700, Jeff Davis wrote:
>On Fri, 2020-09-04 at 14:56 +0200, Tomas Vondra wrote:
>> Those charts show that the CP_SMALL_TLIST resulted in smaller temp
>> files
>> (per EXPLAIN ANALYZE the difference is ~25%) and also lower query
>> durations (also in the ~25% range).
>
>I was able to reproduce the problem, thank you.
>
>Only two attributes are needed, so the CP_SMALL_TLIST projected schema
>only needs a single-byte null bitmap.
>
>But if we just set the unneeded attributes to NULL rather than
>projecting them away, the null bitmap size is based on all 16
>attributes, bumping the bitmap to two bytes.
>
>MAXALIGN(23 + 1) = 24
>MAXALIGN(23 + 2) = 32
>
>I think that explains it. It's not ideal, but projection has a cost as
>well, so I don't think we necessarily need to do something here.
>
>If we are motivated to improve this in v14, we could potentially have a
>different schema for spilled tuples, and perform real projection at
>spill time. But I don't know if that's worth the extra complexity.
>

Thanks for the investigation and explanation.

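To make the bitmap arithmetic concrete, here's a minimal standalone
sketch (assuming MAXIMUM_ALIGNOF = 8 as on common 64-bit builds, and
the 23-byte tuple header from your numbers; the macros here just
mirror the PostgreSQL definitions for illustration):

#include <stdio.h>

#define MAXIMUM_ALIGNOF 8
#define TYPEALIGN(ALIGNVAL, LEN) \
    (((unsigned long) (LEN) + ((ALIGNVAL) - 1)) & \
     ~((unsigned long) ((ALIGNVAL) - 1)))
#define MAXALIGN(LEN) TYPEALIGN(MAXIMUM_ALIGNOF, LEN)

/* bytes of null bitmap needed for NATTS attributes */
#define BITMAPLEN(NATTS) (((NATTS) + 7) / 8)

int
main(void)
{
    int     header = 23;    /* tuple header size used above */

    /* 2 attributes need a 1-byte bitmap, 16 attributes need 2 bytes */
    printf("2 atts:  %lu\n", MAXALIGN(header + BITMAPLEN(2)));   /* 24 */
    printf("16 atts: %lu\n", MAXALIGN(header + BITMAPLEN(16)));  /* 32 */

    return 0;
}

So the extra bitmap byte pushes the tuple over the next MAXALIGN
boundary, costing 8 bytes per spilled tuple.
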
Wouldn't it be enough to just use a slot with a smaller tuple descriptor?
All we'd need to do is create the descriptor in ExecInitAgg after
calling find_hash_columns, use it for rslot/wslot, and then "map" the
attributes in hashagg_spill_tuple (which already almost does that, so
the extra cost should be ~0) and when reading the spilled tuples back.
So I'm not quite buying the argument that this would make a measurable
difference ...

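Just to illustrate what I mean by "mapping" - a toy, self-contained
version of that step (not the actual executor code; the column numbers
and widths are made up):

#include <stdbool.h>
#include <stdio.h>

#define WIDE_NATTS   16     /* attributes in the input descriptor */
#define NARROW_NATTS  2     /* attributes actually needed for hashing */

int
main(void)
{
    int  wide_values[WIDE_NATTS];
    bool wide_isnull[WIDE_NATTS] = {false};

    /* narrow attno -> wide attno, as find_hash_columns would give us */
    int  colmap[NARROW_NATTS] = {3, 7};

    int  narrow_values[NARROW_NATTS];
    bool narrow_isnull[NARROW_NATTS];

    for (int i = 0; i < WIDE_NATTS; i++)
        wide_values[i] = i * 10;

    /*
     * The "mapping" step: copy just the needed attributes, so the
     * spilled tuple is formed against the 2-column descriptor (1-byte
     * null bitmap) instead of the 16-column one (2-byte bitmap).
     */
    for (int i = 0; i < NARROW_NATTS; i++)
    {
        narrow_values[i] = wide_values[colmap[i] - 1];
        narrow_isnull[i] = wide_isnull[colmap[i] - 1];
    }

    for (int i = 0; i < NARROW_NATTS; i++)
        printf("narrow att %d = %d (null: %d)\n",
               i + 1, narrow_values[i], narrow_isnull[i]);

    return 0;
}

The spilled tuples would then stay at MAXALIGN(23 + 1) = 24 instead of
32, without adding any projection to the non-spill path.
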
That being said, I won't insist on fixing this in v13 - at least we know
what the issue is and we can fix it later. The costing seems like a more
serious open item.

OTOH I don't think this example is particularly extreme, and I wouldn't
be surprised if we saw even worse examples in practice - tables tend to
be quite wide, and aggregating just a few of their columns seems likely.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
