Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash

From: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>
To: James Coleman <jtc331(at)gmail(dot)com>
Cc: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>, PostgreSQL mailing lists <pgsql-bugs(at)lists(dot)postgresql(dot)org>
Subject: Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Date: 2019-11-10 21:08:58
Message-ID: CA+hUKGLyafKXBMFqZCSeYikPbdYURbwr+jP6TAy8sY-8LO0V+Q@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-bugs

I think I see what's happening: we're running out of hash bits.

> Buckets: 4194304 (originally 4194304) Batches: 32768 (originally 4096) Memory Usage: 344448kB

Here it's using the lower 22 bits for the bucket number, and started
out using 12 bits for the batch (!), and increased that until it got
to 15 (!!). After using 22 bits for the bucket, there are only 10
bits left, so all the tuples go into the lower 1024 batches.

I'm not sure how exactly this leads to wildly varying numbers of
repartioning cycles (the above-quoted example did it 3 times, the
version that crashed into MaxAllocSize did it ~10 times).

Besides switching to 64 bit hashes so that we don't run out of
information (clearly a good idea), what other options do we have? (1)
We could disable repartitioning (give up on work_mem) after we've run
out of bits; this will eat more memory than it should. (2) You could
start stealing bucket bits; this will eat more CPU than it should,
because you'd effectively have fewer active buckets (tuples will
concentrated on the bits you didn't steal).

In response to

Responses

Browse pgsql-bugs by date

  From Date Subject
Next Message Tomas Vondra 2019-11-10 21:23:52 Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash
Previous Message James Coleman 2019-11-10 19:24:10 Re: BUG #16104: Invalid DSA Memory Alloc Request in Parallel Hash