Re: Default setting for enable_hashagg_disk

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-06-29 21:22:29
Message-ID: 20200629212229.n3afgzq6xpxrr4cu@development
Lists: pgsql-docs pgsql-hackers

On Mon, Jun 29, 2020 at 10:20:14AM -0700, Peter Geoghegan wrote:
>On Mon, Jun 29, 2020 at 8:07 AM Tomas Vondra
><tomas(dot)vondra(at)2ndquadrant(dot)com> wrote:
>> Not sure I follow. Which cases do you mean when you say that 12 could
>> safely do them, but 13 won't? I see the following two cases:
>
>> a) Planner in 12 and 13 disagree about whether the hash table will fit
>> into work_mem.
>>
>> I don't quite see why this would be the case (given the same cardinality
>> estimates etc.), though. That is, if 12 says "will fit" I'd expect 13 to
>> end up with the same conclusion. But maybe 13 has higher per-tuple
>> overhead or something? I know we set aside some memory for BufFiles, but
>> not when we expect the whole hash table to fit into memory.
>
>I have no reason to believe that the planner is any more or any less
>likely to conclude that the hash table will fit in memory in v13 as
>things stand (I don't know if the BufFile issue matters).
>
>In general, grouping estimates probably aren't very good compared to
>join estimates. I imagine that in either v12 or v13 the planner is
>likely to incorrectly believe that it'll all fit in memory fairly
>often. v12 was much too permissive about what could happen. But v13 is
>too conservative.
>

Can you give an example of what you mean by being too permissive or too
conservative? Do you mean the possibility of unlimited memory usage in
v12, and strict enforcement in v13?

IMO enforcing the work_mem limit (in v13) is right in principle, but I
do understand the concerns about unexpected regressions compared to v12.
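
Just to make that concrete (a sketch - the table, data and numbers are
made up), the v13 behavior is directly visible in EXPLAIN ANALYZE:

   SET work_mem = '4MB';
   EXPLAIN (ANALYZE, COSTS OFF)
   SELECT a, count(*) FROM big_table GROUP BY a;

   -- v13: the HashAggregate node reports the spilling, e.g.
   --   Batches: 33  Memory Usage: 4241kB  Disk Usage: 86024kB
   -- v12: nothing like this is reported, because the hash table
   --   simply keeps growing past work_mem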

>> b) Planner believes the hash table will fit, due to underestimate.
>>
>> On 12, we'd just let the hash table overflow, which may be a win when
>> there's enough RAM and the estimate is not "too wrong". But it may
>> easily end with a sad OOM.
>
>It might end up with an OOM on v12 due to an underestimate -- but
>probably not! The fact that a hash aggregate is faster than a group
>aggregate ameliorates the higher memory usage. You might actually use
>less memory this way.
>

I don't understand what you mean by "less memory" when the whole issue
is that the hash table significantly exceeds work_mem?

I don't think OOM is the only performance risk here - using a lot of
memory also forces eviction of data from the page cache (although
writing a lot of temporary files may have a similar effect).
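
To illustrate the underestimate case (hypothetical numbers), it shows
up as a mismatch between the estimated and actual row counts on the
aggregate node:

   EXPLAIN ANALYZE
   SELECT a, b, count(*) FROM t GROUP BY a, b;

   -- HashAggregate (cost=... rows=200 ...) (actual ... rows=5000000 ...)
   --
   -- The planner sized the hash table for ~200 groups, so on v12 it
   -- grows way past work_mem (possibly triggering OOM), while on v13
   -- it starts spilling as soon as work_mem is reached.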

>> On 13, we'll just start spilling. True - people tend to use conservative
>> work_mem values exactly because of cases like this (which is somewhat
>> futile as the underestimate may be arbitrarily wrong) and also because
>> they don't know how many work_mem instances the query will use.
>>
>> So yeah, I understand why people may not want to increase work_mem too
>> much, and maybe hash_work would be a way to get the "no spill" behavior.
>
>Andres wanted to increase the amount of memory that could be used at
>execution time, without changing planning. You could say that hash_mem
>is a more ambitious proposal than that. It's changing the behavior
>across the board -- though in a way that makes sense anyway. It has
>the additional benefit of making it more likely that an in-memory hash
>aggregate will be used. That isn't a problem that we're obligated to
>solve now, so this may seem odd. But if the more ambitious plan is
>actually easier to implement and support, why not pursue it?
>
>hash_mem seems a lot easier to explain and reason about than having
>different work_mem budgets during planning and execution, which is
>clearly a kludge. hash_mem makes sense generally, and more or less
>solves the problems raised on this thread.
>

I agree with this, and I'm mostly OK with having hash_mem. In fact, of
the proposals in this thread I like it the most - as long as it's used
both during planning and execution. It's a pretty clear solution.

It's not a perfect solution in the sense that it does not reintroduce
the v12 behavior exactly (i.e. we'll still spill after reaching
hash_mem), but that may be good enough.
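
For example (hypothetical - assuming the GUC gets adopted as proposed
in this thread), it might look like this:

   SET work_mem = '64MB';   -- sorts etc. still capped at 64MB
   SET hash_mem = '256MB';  -- larger budget for hash tables, used
                            -- both by the planner and at execution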

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
