Re: Out of Memory errors are frustrating as heck!

From: Justin Pryzby <pryzby(at)telsasoft(dot)com>
To: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Gunther <raj(at)gusw(dot)net>, pgsql-performance(at)lists(dot)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Out of Memory errors are frustrating as heck!
Date: 2019-04-21 16:40:22
Message-ID: 20190421164022.GD14223@telsasoft.com
Lists: pgsql-performance

On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> > The growEnabled stuff only prevents infinite loops. It doesn't prevent
> > extreme silliness.
>
> > If a single 32 bit hash value has enough tuples by itself to not fit in
> > work_mem, then it will keep splitting until that value is in a batch by
> > itself before shutting off
>
> I suspect, however, that we might be better off just taking the existence
> of the I/O buffers into account somehow while deciding whether it's worth
> growing further. That is, I'm imagining adding a second independent
> reason for shutting off growEnabled, along the lines of "increasing
> nbatch any further will require an unreasonable amount of buffer memory".
> The question then becomes how to define "unreasonable".
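
(For scale -- the figure here is my back-of-the-envelope estimate, not from the
thread or the patch: each BufFile carries roughly one BLCKSZ buffer, ~8kB by
default, so at a million batches the per-batch buffers alone come to ~8GB, no
matter what work_mem is set to.)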

On Sun, Apr 21, 2019 at 06:15:25PM +0200, Tomas Vondra wrote:
> I think the question the code needs to be asking is "If we double the
> number of batches, does the amount of memory we need drop?" And the
> memory needs to account both for the buffers and per-batch data.
>
> I don't think we can just stop increasing the number of batches when the
> memory for BufFile exceeds work_mem, because that entirely ignores the
> fact that by doing that we force the system to keep the per-batch stuff
> in memory (and that can be almost arbitrary amount).
...
> Of course, this just stops enforcing work_mem at some point, but it at
> least attempts to minimize the amount of memory used.
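
To put that criterion in rough numbers (assuming ~8kB of buffer per BufFile and
an even split of tuples across the new batches -- both assumptions are mine,
not Tomas's):

    memory(nbatch)     ~= hashtable + nbatch * 8kB
    memory(2 * nbatch) ~= hashtable/2 + 2 * nbatch * 8kB

Doubling only pays off while nbatch * 8kB is less than about half the
hashtable; past that point the extra BufFile buffers cost more than the
hashtable memory they push out to disk.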

This patch defines "reasonable" as "additional BatchFiles will not themselves
exceed work_mem; OR, work_mem has already been exceeded, but additional
BatchFiles are going to save us RAM"...
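
To make that condition concrete, here's a rough standalone model of the check.
This is not the attached patch: the function and constant names, the
~8kB-per-BufFile figure, and the even-split assumption are all mine.

    #include <stdbool.h>
    #include <stddef.h>

    /* Hypothetical model of the growth check -- not PostgreSQL source. */
    #define BUFFILE_OVERHEAD ((size_t) 8192)  /* assume ~one BLCKSZ buffer per BufFile */

    static bool
    batch_growth_is_reasonable(size_t hashtable_bytes, int nbatch,
                               size_t work_mem_bytes)
    {
        size_t cur_buffers = (size_t) nbatch * BUFFILE_OVERHEAD;
        size_t new_buffers = cur_buffers * 2;  /* doubling nbatch doubles the BufFiles */

        /* Condition 1: the BufFile buffers still fit in work_mem after doubling. */
        if (new_buffers <= work_mem_bytes)
            return true;

        /*
         * Condition 2: work_mem is already blown, but doubling should still
         * reduce total memory -- the buffers we add cost less than the
         * hashtable memory we expect to push to disk (about half, assuming
         * tuples split evenly across the new batches).
         */
        return (new_buffers - cur_buffers) < hashtable_bytes / 2;
    }

The /2 in condition 2 is where the even-split assumption lives; a skewed split
makes doubling less attractive than this model suggests.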

I think the first condition is not very sensitive and not too important to get
right: it only allows work_mem to be exceeded by 2x, which maybe already happens
for multiple reasons, related to this thread and otherwise. It'd be fine to slap
a factor of /2 or /4 or /8 on it, too.
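
(Rough numbers, assuming the default 4MB work_mem and ~8kB per BufFile -- both
my assumptions: the first condition caps the BufFile buffers at about 512
files, i.e. ~4MB, on top of a hashtable that is itself limited to work_mem,
hence the ~2x total.)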

The current patch doesn't unset growEnabled, since there's no point at which the
hash table should be allowed to grow without bound: if the hash table is
*already* 2x as big as work_mem, nbatches should be doubled.

Justin

Attachment Content-Type Size
v1-0001-account-for-size-of-BatchFile-structure-in-hashJo.patch text/x-diff 2.8 KB
