Re: Default setting for enable_hashagg_disk (hash_mem)

From: Peter Geoghegan <pg(at)bowt(dot)ie>
To: Justin Pryzby <pryzby(at)telsasoft(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Robert Haas <robertmhaas(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, David Rowley <dgrowleyml(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Melanie Plageman <melanieplageman(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Default setting for enable_hashagg_disk (hash_mem)
Date: 2020-07-03 03:56:29
Message-ID: CAH2-Wzn9sO++5NzRzdrmpjwT8e2yNgGU9wBicew6+1EnqjBp4A@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-docs pgsql-hackers

On Thu, Jul 2, 2020 at 7:46 PM Justin Pryzby <pryzby(at)telsasoft(dot)com> wrote:
> Thanks for putting it together, I agree that hash_mem seems to be an obvious
> "escape hatch" that generalizes existing GUCs and independently useful.

It is independently useful. It's a natural consequence of "being
honest" about work_mem and hashing.

> I feel it should same as work_mem, as it's written, and not a multiplier.
>
> And actually I don't think a lower value should be ignored: "mechanism not
> policy". Do we refuse atypical values of maintenance_work_mem < work_mem ?

I see your point, but AFAIK maintenance_work_mem was not retrofit like
this. It seems different. (Unless we add the -1 behavior, perhaps)

> I assumed that hash_mem would default to -1, which would mean "fall back to
> work_mem". We'd then advise users to increase it if they have an issue in v13
> with performance of hashes spilled to disk. (And maybe in other cases, too.)

Yeah, this kind of -1 behavior could make sense.

> I read the argument that hash tables are a better use of RAM than sort.
> However it seems like setting the default to greater than work_mem is a
> separate change than providing the mechanism allowing user to do so. I guess
> the change in default is intended to mitigate the worst possible behavior
> change someone might experience in v13 hashing, and might be expected to
> improve "out of the box" performance. I'm not opposed to it, but it's not an
> essential part of the patch.

That's true.

> In nodeHash.c, you missed an underscore:
> + * Target in-memory hashtable size is hashmem kilobytes.

Got it; thanks.

--
Peter Geoghegan

In response to

Browse pgsql-docs by date

  From Date Subject
Next Message Andrey Lizenko 2020-07-03 13:44:23 "Direct download" button is broken on www.postgresql.org/download/linux/redhat/
Previous Message Peter Geoghegan 2020-07-03 03:35:40 Re: Default setting for enable_hashagg_disk (hash_mem)

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2020-07-03 04:05:10 Re: track_planning causing performance regression
Previous Message Mitar 2020-07-03 03:42:39 Re: Persist MVCC forever - retain history