Re: Default setting for enable_hashagg_disk

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-06-29 15:06:55
Message-ID: 20200629150655.ethstxkf3bkqoown@development
Lists: pgsql-docs pgsql-hackers

On Sun, Jun 28, 2020 at 06:39:40PM -0700, Peter Geoghegan wrote:
>On Sat, Jun 27, 2020 at 3:41 AM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> Well, there are multiple ideas discussed in this (sub)thread, one of
>> them being a per-query memory limit. That requires decisions how much
>> memory should different nodes get, which I think would need to be
>> cost-based.
>
>A design like that probably makes sense. But it's way out of scope for
>Postgres 13, and not something that should be discussed further on
>this thread IMV.
>

100% agree

>> Of course, a simpler scheme like this would not require that. And maybe
>> introducing hash_mem is a good idea - I'm not particularly opposed to
>> that, actually. But I think we should not introduce separate memory
>> limits for each node type, which was also mentioned earlier.
>
>I had imagined that hash_mem would apply to hash join and hash
>aggregate only. A GUC that either represents a multiple of work_mem,
>or an absolute work_mem-style KB value.
>

I'm not against having a hash_mem (and I'd vote for it being a simple
work_mem-style value, not a multiple). Maybe we should have it; the
argument for allowing hashagg (and perhaps hashjoin) to use more memory
than some other nodes seems convincing to me.

I'm just not sure which of the problems mentioned in this thread it
actually addresses ...
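
To make the idea a bit more concrete, this is roughly what I'd imagine
(hash_mem is purely hypothetical at this point, so the name and the
values are just for illustration):

    -- hypothetical: hash_mem does not exist, this only illustrates the
    -- proposed work_mem-style (absolute) variant of the GUC
    SET work_mem = '64MB';    -- sorts, materialize, etc. keep this limit
    SET hash_mem = '256MB';   -- hash aggregate / hash join would get this
    -- a hash-based node would size its hash table against hash_mem
    -- (falling back to work_mem when unset) instead of work_mem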

>> The problem of course is that hash_mem does not really solve the issue
>> discussed at the beginning of this thread, i.e. regressions due to
>> underestimates and unexpected spilling at execution time.
>
>Like Andres, I suspect that that's a smaller problem in practice. A
>hash aggregate that spills often has performance characteristics
>somewhat like a group aggregate + sort, anyway. I'm worried about
>cases where an *in-memory* hash aggregate is naturally far far faster
>than other strategies, and yet we can't use it -- despite the fact
>that Postgres 12 could "safely" do so. (It probably doesn't matter
>whether the slow plan that you get in Postgres 13 is a hash aggregate
>that spills, or something else -- this is not really a costing
>problem.)
>

Not sure I follow. Which cases do you mean when you say that 12 could
safely do them, but 13 won't? I see the following two cases:

a) Planner in 12 and 13 disagree about whether the hash table will fit
into work_mem.

I don't quite see why this would be the case (given the same cardinality
estimates etc.), though. That is, if 12 says "will fit" I'd expect 13 to
end up with the same conclusion. But maybe 13 has higher per-tuple
overhead or something? I know we set aside some memory for BufFiles, but
not when we expect the whole hash table to fit into memory.

b) Planner believes the hash table will fit, due to underestimate.

On 12, we'd just let the hash table overflow, which may be a win when
there's enough RAM and the estimate is not "too wrong". But it may
easily end with a sad OOM.

On 13, we'll just start spilling. True, people tend to use conservative
work_mem values exactly because of cases like this (which is somewhat
futile, as the underestimate may be arbitrarily wrong) and also because
they don't know how many work_mem instances the query will use.

So yeah, I understand why people may not want to increase work_mem too
much, and maybe hash_mem would be a way to get the "no spill" behavior.
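
For example (the table, thresholds and numbers are all made up, just to
show the shape of the behavior), on 13 an underestimated hashagg that
starts spilling shows up in EXPLAIN ANALYZE roughly like this:

    -- made-up example: an underestimated GROUP BY on a table "t"
    SET work_mem = '4MB';
    EXPLAIN (ANALYZE, COSTS OFF)
    SELECT x, count(*) FROM t GROUP BY x;
    --  HashAggregate (actual time=... rows=1000000 loops=1)
    --    Group Key: x
    --    Batches: 33  Memory Usage: 4129kB  Disk Usage: 97120kB
    -- on 12 the same node would keep growing the hash table well past
    -- work_mem, which is faster when the memory is there, but may end
    -- with the OOM mentioned above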

>Besides, hash_mem *can* solve that problem to some extent. Other cases
>(cases where the estimates are so bad that hash_mem won't help) seem
>like less of a concern to me. To some extent, that's the price you pay
>to avoid the risk of an OOM.
>

True, avoiding the risk of OOM has its cost.

>> The thread is getting a rather confusing mix of proposals how to fix
>> that for v13 and proposals how to improve our configuration of memory
>> limits :-(
>
>As I said to Amit in my last message, I think that all of the ideas
>that are worth pursuing involve giving hash aggregate nodes license to
>use more memory than other nodes. One variant involves doing so only
>at execution time, while the hash_mem idea involves formalizing and
>documenting that hash-based nodes are special -- and taking that into
>account during both planning and execution.
>

Understood. I mostly agree with this.

>> Interesting. What is not entirely clear to me how do these databases
>> decide how much should each node get during planning. With the separate
>> work_mem-like settings it's fairly obvious, but how do they do that with
>> the global limit (either per-instance or per-query)?
>
>I don't know, but that seems like a much more advanced way of
>approaching the problem. It isn't in scope here.
>

+1

>Perhaps I'm not considering some unintended consequence of the planner
>giving hash-based nodes extra memory "for free" in the common case
>where hash_mem exceeds work_mem (by 2x, say). But my guess is that
>that won't be a significant problem that biases the planner in some
>obviously undesirable way.
>

My concern is how much more difficult these proposals would make
reasoning about memory usage. Maybe not much, I'm not sure.

I certainly agree it may be beneficial to give more memory to hashagg at
the expense of other nodes.
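
Just to illustrate the kind of reasoning I mean, assuming a hypothetical
hash_mem and made-up values: with work_mem = 64MB and hash_mem = 256MB,
a query with 3 sorts and 2 hash nodes could use up to

    3 * 64MB + 2 * 256MB = 704MB

instead of the 5 * 64MB = 320MB one would estimate today.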

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
