Re: Default setting for enable_hashagg_disk

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Stephen Frost <sfrost(at)snowman(dot)net>, Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-07-12 12:36:48
Message-ID: 20200712123648.js76j6ablk5nbxpo@development
Lists: pgsql-docs pgsql-hackers

On Sat, Jul 11, 2020 at 10:26:22PM -0700, David G. Johnston wrote:
>On Saturday, July 11, 2020, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> "David G. Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com> writes:
>> > On Sat, Jul 11, 2020 at 5:47 PM Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
>> >> It seems like a lot of the disagreement here is focused on Peter's
>> >> proposal to make hash_mem_multiplier default to 2.0. But it doesn't
>> >> seem to me that that's a critical element of the proposal. Why not just
>> >> make it default to 1.0, thus keeping the default behavior identical
>> >> to what it is now?
>> > If we don't default it to something other than 1.0 we might as well just
>> > make it memory units and let people decide precisely what they want to use
>> > instead of adding the complexity of a multiplier.
>> Not sure how that follows? The advantage of a multiplier is that it
>> tracks whatever people might do to work_mem automatically.
>I was thinking that setting -1 would basically do that.

I think Tom meant that the multiplier would automatically track any
changes to work_mem, and adjust hash_mem accordingly. With -1 (and
the GUC in units) you could only keep it exactly equal to work_mem; as
soon as you set it to anything else, you'd have to update both whenever
you change work_mem.
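
To make the difference concrete, here is a minimal Python sketch. It is
not PostgreSQL code: the two helper functions and the kB units are made
up for illustration, and "hash_mem" is only the working name used in
this thread for the hash memory limit.

```python
# Illustrative only: these helpers are hypothetical, not PostgreSQL internals.
MB = 1024  # sizes below are expressed in kB


def hash_mem_from_multiplier(work_mem_kb, multiplier):
    """Multiplier design: the hash limit follows work_mem automatically."""
    return int(work_mem_kb * multiplier)


def hash_mem_from_units(work_mem_kb, hash_mem_kb):
    """Units design: -1 means 'use work_mem', anything else is a fixed limit."""
    return work_mem_kb if hash_mem_kb == -1 else hash_mem_kb


if __name__ == "__main__":
    for work_mem_kb in (32 * MB, 64 * MB):
        print(
            "work_mem=%dMB" % (work_mem_kb // MB),
            "multiplier 1.5 -> %dMB" % (hash_mem_from_multiplier(work_mem_kb, 1.5) // MB),
            "units -1 -> %dMB" % (hash_mem_from_units(work_mem_kb, -1) // MB),
            "units 48MB -> %dMB" % (hash_mem_from_units(work_mem_kb, 48 * MB) // MB),
        )
```

With the multiplier, raising work_mem from 32MB to 64MB moves the hash
limit from 48MB to 96MB without touching the second setting. With units,
-1 keeps the two equal, while an explicit 48MB stays at 48MB until
someone updates it by hand.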

>> In general
>> I'd view work_mem as the base value that people twiddle to control
>> executor memory consumption. Having to also twiddle this other value
>> doesn't seem especially user-friendly.
>I’ll admit I don’t have a feel for what is or is not user-friendly when
>setting these GUCs in a session to override the global defaults. But as
>far as the global defaults I say it’s a wash between (32mb, -1) -> (32mb,
>48mb) and (32mb, 1.0) -> (32mb, 1.5)
>If you want 96mb for the session/query hash setting it to 96mb is
>invariant, while setting it to 3.0 means it can change in the future if the
>system work_mem changes. Knowing the multiplier is 1.5 and choosing 64mb
>for work_mem in the session is possible but also mutable and has
>side-effects. If the user is going to set both values to make it invariant
>we are back to it being a wash.
>I don’t believe using a multiplier will promote better comprehension for
>why this setting exists compared to “-1 means use work_mem but you can
>override a subset if you want.”
>Is having a session level memory setting be mutable something we want to
>Is it more user-friendly?

I still think it should be in simple units, TBH. We already have a
somewhat similar situation with cost parameters, where we often say that
seq_page_cost = 1.0 is the baseline for the other cost parameters, yet
we have not coded that as multipliers.
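
A small Python sketch of what I mean, using the stock default values of
two planner cost parameters. The helper functions are hypothetical and
only there to show the arithmetic.

```python
# Stock defaults of two planner cost GUCs. They are stored as absolute
# values; random_page_cost is not defined as "4x seq_page_cost".
DEFAULT_COSTS = {"seq_page_cost": 1.0, "random_page_cost": 4.0}


def with_setting(costs, name, value):
    """Return a copy of the settings with one parameter changed (hypothetical helper)."""
    updated = dict(costs)
    updated[name] = value
    return updated


def random_to_seq_ratio(costs):
    """Relative cost of a random page fetch versus a sequential one."""
    return costs["random_page_cost"] / costs["seq_page_cost"]


if __name__ == "__main__":
    changed = with_setting(DEFAULT_COSTS, "seq_page_cost", 2.0)
    print(random_to_seq_ratio(DEFAULT_COSTS), random_to_seq_ratio(changed))
```

Changing the baseline does not rescale anything else: random_page_cost
stays at 4.0, so the ratio drops from 4 to 2 unless you adjust it too.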

>> >> If we find that's a poor default, we can always change it later;
>> >> but it seems to me that the evidence for a higher default is
>> >> a bit thin at this point.
>> > So "your default is 1.0 unless you installed the new database on or after
>> > 13.4 in which case it's 2.0"?
>> What else would be new? See e.g. 848ae330a. (Note I'm not suggesting
>> that we'd change it in a minor release.)
>Minor release update is what I had thought, and to an extent was making
>possible by not using the multiplier upfront.
>I agree options are wide open come v14 and beyond.
>David J.


Tomas Vondra
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
