Re: PG18 GIN parallel index build crash - invalid memory alloc request size

From: Tomas Vondra <tomas(at)vondra(dot)me>
To: Gregory Smith <gregsmithpgsql(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: PG18 GIN parallel index build crash - invalid memory alloc request size
Date: 2025-10-29 00:05:00
Message-ID: 1dc13d05-3afc-48e2-914b-d751d6f68457@vondra.me
Lists: pgsql-hackers

On 10/28/25 21:54, Gregory Smith wrote:
> On Sun, Oct 26, 2025 at 5:52 PM Tomas Vondra <tomas(at)vondra(dot)me
> <mailto:tomas(at)vondra(dot)me>> wrote:
>
>     I like (a) more, because it's more consistent with how I understand
>     m_w_m. It's weird to say "use up to 20GB of memory" and then the
>     system overrides that with "1GB".
>     I don't think it affects performance, though.
>
>
> There wasn't really that much gain from 1GB -> 20GB, I was using that
> setting for QA purposes more than measured performance.  During the
> early parts of an OSM build, you need to have a big Node Cache to hit
> max speed, 1/2 or more of a ~90GB file.  Once that part finishes,
> the 45GB+ cache block frees up and index building starts.  I just looked
> at how much was just freed and thought "ehhh...split it in half and
> maybe 20GB maintenance mem?"  Results seemed a little better than the
> 1GB setting I started at, so I've run with that 20GB setting since.
>
> That was back in PG14 and so many bottlenecks have moved around.  Since
> reporting this bug I've done a set of PG18 tests with m_w_m=256MB, and
> one of them just broke my previous record time running PG17.  So even
> that size setting seems fine.
>

Right, that matches my observations from testing the fixes.

I'd attribute this to caching effects, once the accumulated GIN entries
fit into the L3 cache.

>     I also wonder how far we are from hitting the uint32 limits. AFAICS
>     with m_w_m=24GB we might end up with too many elements, even with
>     serial index builds. It'd have to be quite a weird data set, though.
>
>
> Since I'm starting to doubt I ever really needed even 20GB, I wouldn't
> stress about supporting that much being important.  I'll see if I can
> trigger an overflow with a test case though, maybe it's worth protecting
> against even if it's not a functional setting.
>

Yeah, I definitely want to protect against this. I believe similar
failures can happen even with much lower m_w_m values (possibly ~2-3GB),
although only with weird/skewed data sets. AFAICS a constant
single-element array would trigger this, but I haven't tested that.
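
For the record, the kind of data set I have in mind looks roughly like
this (untested sketch; the table name and row count are made up, the
point is just to pile every TID under a single GIN key):

  -- every row carries the same single-element array, so the whole
  -- table's TIDs accumulate under one key during the build
  CREATE TABLE gin_skew (a int[]);
  INSERT INTO gin_skew SELECT ARRAY[1] FROM generate_series(1, 200000000);

  SET maintenance_work_mem = '20GB';
  CREATE INDEX ON gin_skew USING gin (a);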

Serial builds can fail with large maintenance_work_mem too, like this:

ERROR: posting list is too long
HINT: Reduce "maintenance_work_mem".

but it's deterministic, and it's actually a proper error message, not
just some weird "invalid alloc size".
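
Back-of-the-envelope, assuming it's plain 6-byte ItemPointerData that
accumulates for a single key (the real accounting has more overhead):

  1GB / 6 bytes   ~= 179M TIDs   -- one key's list already exceeds
                                    MaxAllocSize in a single palloc
  2^32 * 6 bytes  ~= 24GB        -- roughly where a per-key TID count
                                    would overflow a uint32

So the 24GB figure above is about where the uint32 limit starts to
bite, while the 1GB palloc limit is reachable with much smaller
settings, which fits the ~2-3GB guess.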

Attached is a v3 of the patch series. 0001 and 0002 were already posted,
and I believe either of those would address the issue. 0003 is more of
an optimization, further reducing the memory usage.

I'm putting this through additional testing, which takes time. But it
seems there's still some loose end in 0001, as I just got the "invalid
alloc request" failure with it applied ... I'll take a look tomorrow.

regards

--
Tomas Vondra

Attachment Content-Type Size
v3-0001-Allow-parallel-GIN-builds-to-allocate-large-chunk.patch text/x-patch 2.7 KB
v3-0002-Split-TID-lists-during-parallel-GIN-build.patch text/x-patch 4.2 KB
v3-0003-Trim-TIDs-during-parallel-GIN-builds-more-eagerly.patch text/x-patch 6.7 KB
