Re: Default setting for enable_hashagg_disk (hash_mem)

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: David Rowley <dgrowleyml(at)gmail(dot)com>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk (hash_mem)
Date: 2020-07-08 03:26:31
Message-ID: CAA4eK1J4xkhYhOPwovEZ2VcWkByU75pkZnK022OXyJqTE94NVg@mail.gmail.com
Lists: pgsql-docs pgsql-hackers

On Wed, Jul 8, 2020 at 7:28 AM David Rowley <dgrowleyml(at)gmail(dot)com> wrote:
>
>
> I'd really like to see this thread move forward to a solution and I'm
> not sure how best to do that. I started by reading back over both this
> thread and the original one and tried to summarise what people have
> suggested.
>

Thanks, I think this might help us in reaching some form of consensus
by seeing what most people prefer.

> I understand some people did change their minds along the way, so I
> may have made some mistakes. I could have assumed the latest mindset
> overruled, but it was harder to determine that due to the thread being
> split.
>

> For hash_mem = Justin [16], PeterG [15], Tomas [7]
> hash_mem out of scope for PG13 = Bruce [8], Andres [9]
>

+1 for hash_mem out of scope for PG13. Apart from the reasons you
have mentioned above, there is another: if this is meant as a way to
give users a smooth experience for hash aggregates, then I think the
idea proposed by Robert is not yet ruled out and we should see which
one is better. OTOH, if we want to see this as a way to give a smooth
experience for current use cases for hash aggregates and to improve the
situation for hash joins as well, then I think this is new behavior
which should be discussed for PG14. Having said that, I am not saying
this is a bad idea; I just don't think we should pursue it for PG13.

> Wait for reports from users = Amit [10]

I think this is mostly in line with what Bruce is intending to say
("Maybe do nothing until we see how things go during beta"). So,
probably we can combine the two votes.

> Escape hatch that can be removed later when we get something better =
> Jeff [11], David [12], Pavel [13], Andres [14], Justin [1]
> Add enable_hashagg_spill = Tom [2] (I'm unclear on this proposal. Does
> it affect the planner or executor or both?)
> Maybe do nothing until we see how things go during beta = Bruce [3]
> Just let users set work_mem = Alvaro [4] (I think he changed his mind
> after Andres pointed out that changes other nodes in the plan too)
> Swap enable_hashagg for a GUC that specifies when spilling should
> occur. -1 means work_mem = Robert [17], Amit [18]
> hash_mem does not solve the problem = Tomas [6]
>

[1] - https://www.postgresql.org/message-id/CA+TgmobyV9+T-Wjx-cTPdQuRCgt1THz1mL3v1NXC4m4G-H6Rcw@mail.gmail.com

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
