Re: Default setting for enable_hashagg_disk

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-06-27 10:41:41
Message-ID: 20200627104141.gq7d3hm2tvoqgjjs@development
Lists: pgsql-docs pgsql-hackers

On Fri, Jun 26, 2020 at 05:24:36PM -0700, Peter Geoghegan wrote:
>On Fri, Jun 26, 2020 at 4:59 PM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> I agree larger work_mem for hashagg (and thus less spilling) may mean
>> lower work_mem for some other nodes that are less sensitive to this.
>> But I think this needs to be formulated as a cost-based decision,
>> although I don't know how to do that for the reasons I explained before
>> (bottom-up plan construction vs. distributing the memory budget).
>
>Why do you think that it needs to be formulated as a cost-based
>decision? That's probably true of a scheme that allocates memory very
>intelligently, but what about an approach that's slightly better than
>work_mem?
>

Well, there are multiple ideas discussed in this (sub)thread, one of
them being a per-query memory limit. That requires deciding how much
memory each node should get, which I think would need to be
cost-based.

>What problems do you foresee (if any) with adding a hash_mem GUC that
>gets used for both planning and execution for hash aggregate and hash
>join nodes, in about the same way as work_mem is now?
>

Of course, a simpler scheme like this would not require that. And maybe
introducing hash_mem is a good idea - I'm not particularly opposed to
that, actually. But I think we should not introduce separate memory
limits for each node type, which was also mentioned earlier.
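
To be concrete, I imagine it'd be used about like this (the name and
semantics are just the proposal, of course, nothing is implemented):

    -- hypothetical: hash_mem would cap hash-based nodes (hashagg,
    -- hash join) both during planning and at execution time, while
    -- work_mem would keep applying to sorts etc.
    SET work_mem = '32MB';
    SET hash_mem = '128MB';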

The problem of course is that hash_mem does not really solve the issue
discussed at the beginning of this thread, i.e. regressions due to
underestimates and unexpected spilling at execution time.
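
Consider an example (a sketch, with made-up numbers): the planner
underestimates the number of groups, picks a hashagg expecting it to
fit into work_mem, and at execution time there are far more groups.
On v12 the node would just keep growing past work_mem, on v13 it
starts spilling:

    SET work_mem = '4MB';
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT a, count(*) FROM t GROUP BY a;
    -- on v13 this might report something like:
    --   HashAggregate
    --     Batches: 33  Memory Usage: 4129kB  Disk Usage: 98304kB
    -- while v12 would have stayed in memory, using more than work_mem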

The thread is becoming a rather confusing mix of proposals for how to
fix that in v13 and proposals for how to improve our configuration of
memory limits :-(

>> FWIW some databases already do something like this - SQL Server has
>> something called "memory grant" which I think mostly does what you
>> described here.
>
>Same is true of Oracle. But Oracle also has simple work_mem-like
>settings for sorting and hashing. People don't really use them anymore,
>but apparently it was once common for the DBA to explicitly give over
>more memory to hashing -- much like the hash_mem setting I asked about.
>IIRC the same is true of DB2.
>

Interesting. What is not entirely clear to me is how these databases
decide how much memory each node should get during planning. With the
separate work_mem-like settings it's fairly obvious, but how do they
do that with a global limit (either per-instance or per-query)?

>> The difference between sort and hashagg spills is that for sorts
>> there is no behavior change. Plans that did (not) spill in v12 will
>> behave the same way on v13, modulo some random perturbation. For
>> hashagg that's not the case - some queries that did not spill before
>> will spill now.
>>
>> So even if the hashagg spills are roughly equal to sort spills, both
>> are significantly more expensive than not spilling.
>
>Just to make sure we're on the same page: both are significantly more
>expensive than a hash aggregate not spilling *specifically*. OTOH, a
>group aggregate may not be much slower when it spills compared to an
>in-memory sort group aggregate. It may even be noticeably faster, due
>to caching effects, as you mentioned at one point upthread.
>
>This is the property that makes hash aggregate special, and justifies
>giving it more memory than other nodes on a system-wide basis (the same
>thing applies to hash join). This could even work as a multiplier of
>work_mem.
>

Yes, I agree.
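
FWIW the multiplier version seems simple enough to use (hypothetical
name, nothing like this exists yet):

    -- hypothetical: hash-based nodes would get
    -- work_mem * hash_mem_multiplier, everything else plain work_mem
    SET work_mem = '64MB';
    SET hash_mem_multiplier = 2.0;  -- hashagg / hash join get 128MB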

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
