Re: Out of Memory errors are frustrating as heck!

From: Gaetano Mendola <mendola(at)gmail(dot)com>
To: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Gunther <raj(at)gusw(dot)net>, pgsql-performance(at)lists(dot)postgresql(dot)org, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Out of Memory errors are frustrating as heck!
Date: 2019-04-22 08:07:52
Message-ID: CAJycT5rE2Ya9FRfpfWTs_CH8ohk6OZG6ZfXQ4Ma9Bp6UW9OS6g@mail.gmail.com
Lists: pgsql-performance

Shouldn't batch splitting be followed by a hash function change?
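
For context, a minimal sketch of why splitting alone cannot help once a
single hash value dominates (an illustration of the idea behind
ExecHashGetBucketAndBatch, not the actual PostgreSQL code): doubling
nbatch merely consults one more bit of the same 32-bit hash, so tuples
sharing one hash value always land in the same batch; only a different
hash function could separate them.

    #include <stdint.h>

    /*
     * Batch assignment sketch: the batch number is carved out of the
     * hash bits above those used for the bucket.  Doubling nbatch
     * (always a power of two) consumes one more bit of the same hash,
     * so tuples with identical hash values can never be split apart.
     */
    static inline uint32_t
    batch_for_tuple(uint32_t hashvalue, uint32_t log2_nbuckets,
                    uint32_t nbatch)
    {
        return (hashvalue >> log2_nbuckets) & (nbatch - 1);
    }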

On Mon, Apr 22, 2019, 05:09 Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
wrote:

> On Sun, Apr 21, 2019 at 11:40:22AM -0500, Justin Pryzby wrote:
> >On Sun, Apr 21, 2019 at 10:36:43AM -0400, Tom Lane wrote:
> >> Jeff Janes <jeff(dot)janes(at)gmail(dot)com> writes:
> >> > The growEnabled stuff only prevents infinite loops. It doesn't
> >> > prevent extreme silliness.
> >>
> >> > If a single 32 bit hash value has enough tuples by itself to not
> >> > fit in work_mem, then it will keep splitting until that value is
> >> > in a batch by itself before shutting off
> >>
> >> I suspect, however, that we might be better off just taking the
> >> existence of the I/O buffers into account somehow while deciding
> >> whether it's worth growing further. That is, I'm imagining adding
> >> a second independent reason for shutting off growEnabled, along
> >> the lines of "increasing nbatch any further will require an
> >> unreasonable amount of buffer memory". The question then becomes
> >> how to define "unreasonable".
> >
> >On Sun, Apr 21, 2019 at 06:15:25PM +0200, Tomas Vondra wrote:
> >> I think the question the code needs to be asking is "If we double the
> >> number of batches, does the amount of memory we need drop?" And the
> >> memory needs to account both for the buffers and per-batch data.
> >>
> >> I don't think we can just stop increasing the number of batches when the
> >> memory for BufFile exceeds work_mem, because that entirely ignores the
> >> fact that by doing that we force the system to keep the per-batch stuff
> >> in memory (and that can be an almost arbitrary amount).
> >...
> >> Of course, this just stops enforcing work_mem at some point, but it at
> >> least attempts to minimize the amount of memory used.
> >
> >This patch defines reasonable as "additional BatchFiles will not
> >themselves exceed work_mem; OR, work_mem is already exceeded but
> >additional BatchFiles are going to save us RAM"...
> >
>
> OK.
>
> >I think the first condition is insensitive and not too important to
> >get right; it only allows work_mem to be exceeded by 2x, which maybe
> >already happens for multiple reasons, related to this thread and
> >otherwise. It'd be fine to slap on a factor of /2 or /4 or /8 there
> >too.
> >
>
> TBH I'm not quite sure I understand all the conditions in the patch - it
> seems unnecessarily complicated. And I don't think it actually minimizes
> the amount of memory used for hash table + buffers, because it keeps the
> same spaceAllowed (which pushes nbatches up). At some point it actually
> makes sense to bump spaceAllowed and make larger batches instead of
> adding more batches, and the patch does not seem to do that.
>
> Also, the patch does this:
>
>     if (hashtable->nbatch * sizeof(PGAlignedBlock) < hashtable->spaceAllowed)
>     {
>         ExecHashIncreaseNumBatches(hashtable);
>     }
>     else if (hashtable->spaceUsed / 2 >= hashtable->spaceAllowed)
>     {
>         /*
>          * Exceeded spaceAllowed by 2x, so we'll save RAM by allowing
>          * nbatches to increase.
>          */
>         /* I think this branch would be hit almost same as below branch */
>         ExecHashIncreaseNumBatches(hashtable);
>     }
> ...
>
> but the reasoning for the second branch seems wrong, because
>
> (spaceUsed/2 >= spaceAllowed)
>
> is not enough to guarantee that we actually save memory by doubling the
> number of batches. To do that, we need to make sure that
>
> (spaceUsed/2 >= hashtable->nbatch * sizeof(PGAlignedBlock))
>
> But that may not be true - it certainly is not guaranteed by not getting
> into the first branch.
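
To make the accounting concrete, here is a sketch of that test (my
illustration, not executor code; BLCKSZ stands in for
sizeof(PGAlignedBlock), 8kB in a default build). Assuming a doubling
halves spaceUsed while doubling the BufFile memory, it only shrinks
the total footprint when the hash-table saving exceeds the buffer
growth:

    #include <stdbool.h>
    #include <stddef.h>

    #define BLCKSZ 8192    /* stand-in for sizeof(PGAlignedBlock) */

    /*
     * Before:  spaceUsed     +     nbatch * BLCKSZ
     * After:   spaceUsed / 2 + 2 * nbatch * BLCKSZ
     * After < Before  <=>  nbatch * BLCKSZ < spaceUsed / 2
     */
    static bool
    doubling_saves_memory(size_t spaceUsed, size_t nbatch)
    {
        return nbatch * BLCKSZ < spaceUsed / 2;
    }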
>
> Consider an ideal example with uniform distribution:
>
> create table small (id bigint, val text);
> create table large (id bigint, val text);
>
> insert into large select 1000000000 * random(), md5(i::text)
> from generate_series(1, 700000000) s(i);
>
> insert into small select 1000000000 * random(), md5(i::text)
> from generate_series(1, 10000) s(i);
>
> vacuum analyze large;
> vacuum analyze small;
>
> update pg_class set (relpages, reltuples) = (1000000, 1)
> where relname = 'large';
>
> update pg_class set (relpages, reltuples) = (1, 1000000)
> where relname = 'small';
>
> set work_mem = '1MB';
>
> explain analyze select * from small join large using (id);
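
(The pg_class updates are presumably there to feed the planner wrong
statistics, making large look like a single tuple and small like a
million, so that the 700M-row table ends up on the hashed side with a
far too small initial nbatch.)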
>
> A log after each call to ExecHashIncreaseNumBatches says this:
>
> nbatch=2 spaceUsed=463200 spaceAllowed=1048576 BufFile=16384
> nbatch=4 spaceUsed=463120 spaceAllowed=1048576 BufFile=32768
> nbatch=8 spaceUsed=457120 spaceAllowed=1048576 BufFile=65536
> nbatch=16 spaceUsed=458320 spaceAllowed=1048576 BufFile=131072
> nbatch=32 spaceUsed=457120 spaceAllowed=1048576 BufFile=262144
> nbatch=64 spaceUsed=459200 spaceAllowed=1048576 BufFile=524288
> nbatch=128 spaceUsed=455600 spaceAllowed=1048576 BufFile=1048576
> nbatch=256 spaceUsed=525120 spaceAllowed=1048576 BufFile=2097152
> nbatch=256 spaceUsed=2097200 spaceAllowed=1048576 BufFile=2097152
> nbatch=512 spaceUsed=2097200 spaceAllowed=1048576 BufFile=4194304
> nbatch=1024 spaceUsed=2097200 spaceAllowed=1048576 BufFile=8388608
> nbatch=2048 spaceUsed=2097200 spaceAllowed=1048576 BufFile=16777216
> nbatch=4096 spaceUsed=2097200 spaceAllowed=1048576 BufFile=33554432
> nbatch=8192 spaceUsed=2097200 spaceAllowed=1048576 BufFile=67108864
> nbatch=16384 spaceUsed=2097200 spaceAllowed=1048576 BufFile=134217728
>
> So we've succeeded in keeping spaceUsed below 2*spaceAllowed (which
> seems rather confusing, BTW), but we've allocated 128MB for BufFile
> (16384 batches x 8kB per buffer). So about 130MB in total, with 16k
> batches.
>
> What I think might work better is the attached v2 of the patch, with a
> single top-level condition, comparing the combined memory usage
> (spaceUsed + BufFile) against spaceAllowed. But it also tweaks
> spaceAllowed once the size needed for BufFile gets over work_mem/3.
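
Read literally, that amounts to something like the following sketch (a
reconstruction from the description and the log below, not the v2
patch itself; the names mirror the hashtable fields but are
illustrative):

    #include <stddef.h>

    #define BLCKSZ 8192    /* stand-in for sizeof(PGAlignedBlock) */

    static void
    maybe_increase_nbatch(size_t spaceUsed, size_t work_mem,
                          size_t *nbatch, size_t *spaceAllowed)
    {
        size_t  buffile_mem = *nbatch * BLCKSZ;

        /* single top-level condition: combined usage vs. the budget */
        if (spaceUsed + buffile_mem > *spaceAllowed)
        {
            *nbatch *= 2;            /* ExecHashIncreaseNumBatches() */
            buffile_mem *= 2;

            /*
             * Once the per-batch buffers alone need more than a third
             * of work_mem, grow the budget instead of chasing ever
             * more batches (cf. spaceAllowed = 3 * BufFile in the log
             * below).
             */
            if (buffile_mem > work_mem / 3)
                *spaceAllowed = 3 * buffile_mem;
        }
    }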
>
> And it behaves like this:
>
> nbatch=2 spaceUsed=458640 spaceAllowed=1048576 BufFile=16384
> nbatch=4 spaceUsed=455040 spaceAllowed=1048576 BufFile=32768
> nbatch=8 spaceUsed=440160 spaceAllowed=1048576 BufFile=65536
> nbatch=16 spaceUsed=426560 spaceAllowed=1048576 BufFile=131072
> nbatch=32 spaceUsed=393200 spaceAllowed=1048576 BufFile=262144
> nbatch=64 spaceUsed=329120 spaceAllowed=1572864 BufFile=524288
> nbatch=128 spaceUsed=455600 spaceAllowed=3145728 BufFile=1048576
> nbatch=256 spaceUsed=987440 spaceAllowed=6291456 BufFile=2097152
> nbatch=512 spaceUsed=2040560 spaceAllowed=12582912 BufFile=4194304
> nbatch=1024 spaceUsed=4114640 spaceAllowed=25165824 BufFile=8388608
> nbatch=2048 spaceUsed=8302880 spaceAllowed=50331648 BufFile=16777216
>
> So we end up with just 2k batches, using ~24MB of memory in total.
> That's because the spaceAllowed limit was bumped up instead of adding
> more and more batches.
>
> >The current patch doesn't unset growEnabled, since there's no point
> >at which the hashtable should grow without bound: if hash tables are
> >*already* exceeding work_mem by 2x, nbatches should be doubled.
> >
>
> Not sure. I guess it might be useful to re-evaluate the flag after a
> while - not necessarily by actually enabling it right away, but just
> checking if it'd move any tuples. Just disabling it once may be an issue
> when the input data is somehow correlated, which seems to be one of the
> issues with the data set discussed in this thread.
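
Such a re-check could be as cheap as a dry run over the batch's hash
values (a hypothetical helper, not existing executor code): doubling
nbatch only helps if the newly consulted hash bit actually partitions
the tuples, i.e. some would stay and some would move.

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdint.h>

    static bool
    split_would_separate(const uint32_t *hashvalues, size_t ntuples,
                         uint32_t log2_nbuckets, uint32_t nbatch)
    {
        bool    some_stay = false;
        bool    some_move = false;

        for (size_t i = 0; i < ntuples; i++)
        {
            /* the hash bit that doubling nbatch would newly consult */
            if ((hashvalues[i] >> log2_nbuckets) & nbatch)
                some_move = true;
            else
                some_stay = true;

            if (some_stay && some_move)
                return true;    /* split genuinely partitions the batch */
        }
        return false;           /* all-or-none: splitting is pointless */
    }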
>
> regards
>
> --
> Tomas Vondra http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
