Re: Default setting for enable_hashagg_disk

From: Stephen Frost <sfrost(at)snowman(dot)net>
To: Jeff Davis <pgsql(at)j-davis(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Peter Geoghegan <pg(at)bowt(dot)ie>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-07-10 14:34:15
Message-ID: 20200710143415.GJ12375@tamriel.snowman.net
Lists: pgsql-docs pgsql-hackers

Greetings,

* Jeff Davis (pgsql(at)j-davis(dot)com) wrote:
> In principle, Stephen is right: the v12 behavior is a bug, lots of
> people are unhappy about it, it causes real problems, and it would not
> be acceptable if proposed today. Otherwise I wouldn't have spent the
> time to fix it.
>
> Similarly, potential regressions are not the "fault" of my feature --
> they are the fault of the limitations of work_mem, the limitations of
> the planner, the wrong expectations from customers, or just
> happenstance.

Exactly.

> But at a certain point, I have to weigh the potential anger of
> customers hitting regressions versus the potential anger of hackers
> seeing a couple extra GUCs. I have to say that I am more worried about
> the former.

We work, quite intentionally, to avoid having a billion knobs that
people have to understand and to tune. Yes, we could create a bunch of
new GUCs to change all kinds of behavior, and we could add hints while
we're at it, but there's been quite understandable and good pressure
against doing so because much of the point of this database system is
that it should be figuring out the best plan on its own and within the
constraints that users have configured.

> If there is some more serious consequence of adding a GUC that I missed
> in this thread, please let me know. Otherwise, I intend to commit a new
> GUC shortly that will enable users to bypass work_mem for HashAgg, just
> as in v12.

I don't think this thread has properly considered that every new GUC,
every additional knob that we create, increases the complexity of the
system that users have to deal with and, in some sense, represents a
failure on our part to just figure out what the right answer
is. For such a small set of users, who somehow have a problem with a
Sort taking up more memory but are fine with HashAgg doing so, I don't
think the requirement is met that this is a large enough issue to
warrant a new GUC. Users who are actually hit by this in a negative way
have an option: increase work_mem to reflect what was actually happening
already. I seriously doubt that we'd get tons of users complaining
about that answer or asking us to have something separate from that, and
we'd avoid adding some new GUC that has to be explained to every new
user of the system and that complicates the documentation explaining how
work_mem works.

Thanks,

Stephen
