Re: Disk-based hash aggregate's cost model

From: Jeff Davis <pgsql(at)j-davis(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Disk-based hash aggregate's cost model
Date: 2020-09-04 19:33:24
Message-ID: d518e363381470aa4b8595281a8384017ddf80da.camel@j-davis.com
Lists: pgsql-hackers

On Fri, 2020-09-04 at 21:01 +0200, Tomas Vondra wrote:
> Wouldn't it be enough to just use a slot with a smaller tuple
> descriptor? All we'd need to do is create the descriptor in
> ExecInitAgg after calling find_hash_columns, use it for rslot/wslot,
> and then "map" the attributes in hashagg_spill_tuple (which already
> almost does that, so the extra cost should be 0) and when reading
> the spilled tuples.

That's a good point; it's probably not much code to make it work.
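Something along these lines, I assume (just a rough sketch of the idea,
not tested; hashcols/numhashcols/max_hashcol are placeholder names for
whatever find_hash_columns gives us):

/* in ExecInitAgg(), after find_hash_columns(): build a narrower descriptor */
TupleDesc   inputdesc = ExecGetResultType(outerPlanState(aggstate));
TupleDesc   spilldesc = CreateTemplateTupleDesc(numhashcols);

for (int i = 0; i < numhashcols; i++)
{
    Form_pg_attribute att = TupleDescAttr(inputdesc, hashcols[i] - 1);

    TupleDescInitEntry(spilldesc, i + 1, NameStr(att->attname),
                       att->atttypid, att->atttypmod, att->attndims);
}

/* slot used only for writing/reading spilled tuples */
TupleTableSlot *spillslot = MakeSingleTupleTableSlot(spilldesc,
                                                     &TTSOpsVirtual);

/* in hashagg_spill_tuple(): copy only the needed attributes */
slot_getsomeattrs(inputslot, max_hashcol);
ExecClearTuple(spillslot);
for (int i = 0; i < numhashcols; i++)
{
    spillslot->tts_values[i] = inputslot->tts_values[hashcols[i] - 1];
    spillslot->tts_isnull[i] = inputslot->tts_isnull[hashcols[i] - 1];
}
ExecStoreVirtualTuple(spillslot);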

> So I'm not quite buying the argument that this would make
> measurable difference ...

I meant that "projection of all input tuples" (i.e. CP_SMALL_TLIST) has
a cost; if we project only at spill time, it should be fine.

Regards,
Jeff Davis
