Re: Out of Memory errors are frustrating as heck!

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Gunther <raj(at)gusw(dot)net>
Cc: Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>, Jeff Janes <jeff(dot)janes(at)gmail(dot)com>, Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, pgsql-performance(at)lists(dot)postgresql(dot)org, Justin Pryzby <pryzby(at)telsasoft(dot)com>
Subject: Re: Out of Memory errors are frustrating as heck!
Date: 2019-04-16 15:30:19
Message-ID: 28937.1555428619@sss.pgh.pa.us
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-performance

Gunther <raj(at)gusw(dot)net> writes:
> And there we go:

> Breakpoint 6, AllocSetAlloc (context=0x29a6450, size=8) at aset.c:718
> 718 {
> (gdb) bt 8
> #0 AllocSetAlloc (context=0x29a6450, size=8) at aset.c:718
> #1 0x000000000084e8ad in palloc0 (size=size(at)entry=8) at mcxt.c:969
> #2 0x0000000000702b63 in makeBufFileCommon (nfiles=nfiles(at)entry=1) at buffile.c:119
> #3 0x0000000000702e4c in makeBufFile (firstfile=68225) at buffile.c:138
> #4 BufFileCreateTemp (interXact=interXact(at)entry=false) at buffile.c:201
> #5 0x000000000061060b in ExecHashJoinSaveTuple (tuple=0x2ba1018, hashvalue=<optimized out>, fileptr=0x6305b00) at nodeHashjoin.c:1220
> #6 0x000000000060d766 in ExecHashTableInsert (hashtable=hashtable(at)entry=0x2b50ad8, slot=<optimized out>, hashvalue=<optimized out>)
> at nodeHash.c:1663
> #7 0x0000000000610c8f in ExecHashJoinNewBatch (hjstate=0x29a6be0) at nodeHashjoin.c:1051

Hmm ... this matches up with a vague thought I had that for some reason
the hash join might be spawning a huge number of separate batches.
Each batch would have a couple of files with associated in-memory
state including an 8K I/O buffer, so you could account for the
"slow growth" behavior you're seeing by periodic decisions to
increase the number of batches.

You might try watching calls to ExecHashIncreaseNumBatches
and see if that theory holds water.

This could only happen with a very unfriendly distribution of the
hash keys, I think. There's a heuristic in there to shut off
growth of nbatch if we observe that we're making no progress at
all, but perhaps this is a skewed distribution that's not quite
skewed enough to trigger that.

regards, tom lane

In response to

Responses

Browse pgsql-performance by date

  From Date Subject
Next Message Daulat Ram 2019-04-16 17:41:31 Postgres backup & restore
Previous Message Gunther 2019-04-16 06:33:19 Re: Out of Memory errors are frustrating as heck!