Re: Default setting for enable_hashagg_disk

From: "David G(dot) Johnston" <david(dot)g(dot)johnston(at)gmail(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>
Cc: Peter Geoghegan <pg(at)bowt(dot)ie>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-07-09 23:57:22
Message-ID: CAKFQuwbnEP0cqYOyWiigyRpkcjArzR6O7YFXpra9VLAXtRkbQg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Thu, Jul 9, 2020 at 3:58 PM Stephen Frost <sfrost(at)snowman(dot)net> wrote:

> > > If folks
> > > want to let HashAgg use more memory then they can set work_mem higher,
> > > just the same as if they want a Sort node to use more memory or a
> > > HashJoin. Yes, that comes with potential knock-on effects about other
> > > nodes (possibly) using more memory but that's pretty well understood
> for
> > > all the other cases and I don't think that it makes sense to have a
> > > special case for HashAgg when the only justification is that "well, you
> > > see, it used to have this bug, so...".
> >
> > That's not the only justification. The other justification is that
> > it's generally reasonable to prefer giving hash aggregate more memory.
>
> Sure, and it's generally reasonably to prefer giving Sorts more memory
> too... as long as you've got it available.
>

Looking at the docs for work_mem it was decided to put "such as" before
"sort" and "hash table" even though the rest of the paragraph then only
talks about those two. Are there other things possible that warrant the
"such as" qualifier or can we write "specifically, a sort, or a hash table"?

For me, as a user that doesn't presently need to deal with all this, I'd
rather have a multiplier GUC for max_hash_work_mem_units defaulting to
something like 4. The planner would then use that multiple. We've closed
the "bug" while still giving me a region of utility that emulates the v12
reality without me touching anything, or even being aware of the bug that
is being fixed.

I cannot see myself wanting to globally revert to v12 behavior on the
execution side as the OOM side-effect is definitely more unpleasant than
slowed queries. If I have to go into a specific query anyway I'd go for a
measured change on the work_mem or multiplier rather than choosing to
consume as much memory as needed.

David J.

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Stephen Frost 2020-07-10 00:08:42 Re: Default setting for enable_hashagg_disk
Previous Message Peter Geoghegan 2020-07-09 23:42:25 Re: Default setting for enable_hashagg_disk

Browse pgsql-hackers by date

  From Date Subject
Next Message Cary Huang 2020-07-10 00:03:57 Re: pg_dump --where option
Previous Message Peter Geoghegan 2020-07-09 23:42:25 Re: Default setting for enable_hashagg_disk