Re: Default setting for enable_hashagg_disk

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: Peter Geoghegan <pg(at)bowt(dot)ie>
Cc: Stephen Frost <sfrost(at)snowman(dot)net>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Davis <pgsql(at)j-davis(dot)com>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, David Rowley <dgrowleyml(at)gmail(dot)com>, Justin Pryzby <pryzby(at)telsasoft(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk
Date: 2020-07-14 02:38:14
Message-ID: CAA4eK1KgTD5=7xcUrbnuOo2cYTCNHZjaByGf_rUZcWYv5x8uMw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Mon, Jul 13, 2020 at 9:50 PM Peter Geoghegan <pg(at)bowt(dot)ie> wrote:
>
> On Mon, Jul 13, 2020 at 6:13 AM Stephen Frost <sfrost(at)snowman(dot)net> wrote:
> > Yes, increasing work_mem isn't unusual, at all.
>
> It's unusual as a way of avoiding OOMs!
>
> > Eh? That's not at all what it looks like- they were getting OOM's
> > because they set work_mem to be higher than the actual amount of memory
> > they had and the Sort before the GroupAgg was actually trying to use all
> > that memory. The HashAgg ended up not needing that much memory because
> > the aggregated set wasn't actually that large. If anything, this shows
> > exactly what Jeff's fine work here is (hopefully) going to give us- the
> > option to plan a HashAgg in such cases, since we can accept spilling to
> > disk if we end up underestimate, or take advantage of that HashAgg
> > being entirely in memory if we overestimate.
>
> I very specifically said that it wasn't a case where something like
> hash_mem would be expected to make all the difference.
>
> > Having looked back, I'm not sure that I'm really in the minority
> > regarding the proposal to add this at this time either- there's been a
> > few different comments that it's too late for v13 and/or that we should
> > see if we actually end up with users seriously complaining about the
> > lack of a separate way to specify the memory for a given node type,
> > and/or that if we're going to do this then we should have a broader set
> > of options covering other nodes types too, all of which are positions
> > that I agree with.
>
> By proposing to do nothing at all, you are very clearly in a small
> minority. While (for example) I might have debated the details with
> David Rowley a lot recently, and you couldn't exactly say that we're
> in agreement, our two positions are nevertheless relatively close
> together.
>
> AFAICT, the only other person that has argued that we should do
> nothing (have no new GUC) is Bruce, which was a while ago now. (Amit
> said something similar, but has since softened his opinion [1]).
>

To be clear, my vote for PG13 is not to do anything till we have clear
evidence of regressions. In the email you quoted, I was trying to say
that due to parallelism we might not have the problem for which we are
planning to provide an escape-hatch or hash_mem GUC. I think the
reason for the delay in getting to the agreement is that there is no
clear evidence for the problem (user-reported cases or results of some
benchmarks like TPC-H) unless I have missed something.

Having said that, I understand that we have to reach some conclusion
to close this open item and if the majority of people are in-favor of
escape-hatch or hash_mem solution then we have to do one of those.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Daniel Gustafsson 2020-07-14 10:52:23 Re: ALTER SYSTEM between upgrades
Previous Message Bruce Momjian 2020-07-13 23:58:49 ALTER SYSTEM between upgrades

Browse pgsql-hackers by date

  From Date Subject
Next Message Fujii Masao 2020-07-14 02:42:50 Re: POC and rebased patch for CSN based snapshots
Previous Message Fujii Masao 2020-07-14 02:19:36 Re: Transactions involving multiple postgres foreign servers, take 2