Re: Default setting for enable_hashagg_disk

From: David Rowley <dgrowleyml(at)gmail(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-07-11 01:19:39
Message-ID: CAApHDvpxbHHP566rRjJWgnfS0YOxR53EZTz5LHH-jcEKvqdj4g@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Sat, 11 Jul 2020 at 12:47, David G. Johnston
<david(dot)g(dot)johnston(at)gmail(dot)com> wrote:
> The multiplier seems strictly better than "rely on work_mem alone, i.e., do nothing"; the detracting factor being one more GUC. Even if one wants to argue the solution is ugly or imperfect the current state seems worse and a more perfect option doesn't seem worth waiting for. The multiplier won't make every single upgrade a non-event but it provides a more than sufficient amount of control and in the worse case can be effectively ignored by setting it to 1.0.

My argument wasn't related to if the new GUC should be a multiplier of
work_mem or an absolute amount of memory. The point I was trying to
make was that the solution to add a GUC to allow users to increase the
memory Hash Join and Hash Agg for plans which don't contain any other
nodes types that use work_mem is the same as doing nothing. As of
today, those people could just increase work_mem. If we get hash_mem
or some variant that is a multiplier of work_mem, then that user is in
exactly the same situation for that plan. i.e there's no ability to
increase the memory allowances for Hash Agg alone.

If we have to have a new GUC, my preference would be hashagg_mem,
where -1 means use work_mem and a value between 64 and MAX_KILOBYTES
would mean use that value. We'd need some sort of check hook to
disallow 0-63. I really am just failing to comprehend why we're
contemplating changing the behaviour of Hash Join here. Of course, I
understand that that node type also uses a hash table, but why does
that give it the right to be involved in a change that we're making to
try and give users the ability to avoid possible regressions with Hash
Agg?

David

In response to

Responses

Browse pgsql-docs by date

  From Date Subject
Next Message David G. Johnston 2020-07-11 01:35:43 Re: Default setting for enable_hashagg_disk
Previous Message David G. Johnston 2020-07-11 00:47:03 Re: Default setting for enable_hashagg_disk

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2020-07-11 01:35:07 Re: pg_regress cleans up tablespace twice.
Previous Message David G. Johnston 2020-07-11 00:47:03 Re: Default setting for enable_hashagg_disk