Re: Unexpected behavior after OOM errors

From:	Heikki Linnakangas <hlinnaka(at)iki(dot)fi>
To:	Michael Paquier <michael(at)paquier(dot)xyz>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>
Cc:	Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>
Subject:	Re: Unexpected behavior after OOM errors
Date:	2026-06-18 06:42:39
Message-ID:	7b3c2a03-eeef-456e-bab1-30389550233c@iki.fi
Lists:	pgsql-hackers

On 18/06/2026 07:37, Michael Paquier wrote:
> On Wed, Jun 17, 2026 at 02:27:25PM +0200, Matthias van de Meent wrote:
>> On Wed, 17 Jun 2026 at 08:00, Alexander Lakhin <exclusion(at)gmail(dot)com> wrote:
>>> 1) An issue in lookup_type_cache()
>>
>> I believe this is caused by partial subsystem initialization. Attached
>> patch 0001 should address this failure without causing the server to
>> restart on OOM.
>
> Hmm. I think that this is an ordering problem. We could make the
> callbacks be registered last, once we are sure that the two hash
> tables and the in-progress list have been initialized. I am not sure
> that this requires a new facility; it is also an advantage to keep the
> initialization sequence in one code path, without an abstraction.
>
> TypeCacheHash and RelIdToTypeIdCacheHash are in the
> TopMemoryContext, static to the process, so we could just check them
> for NULL-ness to make the initialization repeatable. That gives me
> the attached v2. Reusing Alexander's randomness trick, that looks
> stable here.

Yeah, this can be solved by ordering. It's a bit fiddly though. I don't
know about Matthias's proposal either, but it'd be nice to have a less
fiddly system for these.
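
To illustrate the ordering approach, here is a minimal standalone sketch, not the actual typcache.c code. The names type_hash, relid_hash, try_alloc() and type_cache_init() are hypothetical stand-ins: each table is created only if it is still NULL, and the callbacks are registered last, so a call that failed halfway can simply be retried.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the two hash tables and the callback state. */
static void *type_hash = NULL;
static void *relid_hash = NULL;
static bool callbacks_registered = false;

/* Test hook: allocations allowed to succeed before a simulated OOM (-1 = never fail). */
static int allocs_until_oom = -1;

static void *
try_alloc(size_t size)
{
    if (allocs_until_oom == 0)
        return NULL;            /* simulated out-of-memory */
    if (allocs_until_oom > 0)
        allocs_until_oom--;
    return malloc(size);
}

/*
 * Returns true once everything is set up.  Safe to call again after a
 * failure: steps that already succeeded are skipped via the NULL checks.
 */
static bool
type_cache_init(void)
{
    if (type_hash == NULL && (type_hash = try_alloc(64)) == NULL)
        return false;
    if (relid_hash == NULL && (relid_hash = try_alloc(64)) == NULL)
        return false;

    /* Register callbacks last, only after both tables exist. */
    assert(type_hash != NULL && relid_hash != NULL);
    callbacks_registered = true;
    return true;
}
```

The fiddly part is that every new piece of state added later needs its own NULL check in the right position.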

One idea is to have something similar to
START_CRIT_SECTION()/END_CRIT_SECTION(), but instead of promoting the
ERROR to a PANIC, promote it to FATAL. That way, if any of these
one-time allocations fail, the backend exits. If you're so
memory-starved that you cannot even initialize the type cache, you won't
be able to do anything useful with the connection anyway.
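
As a rough model of that idea (standalone C, with hypothetical names; no such facility exists in the tree as far as this sketch assumes), a depth counter could promote ERROR to FATAL while it is nonzero, the same way a critical-section counter promotes ERROR to PANIC:

```c
#include <assert.h>

/* Simplified severity levels for this model; not the real elog.h values. */
typedef enum { LVL_ERROR, LVL_FATAL, LVL_PANIC } Level;

/* Hypothetical counter, analogous in spirit to a critical-section count. */
static int fatal_section_depth = 0;

static void
start_fatal_section(void)
{
    fatal_section_depth++;
}

static void
end_fatal_section(void)
{
    assert(fatal_section_depth > 0);
    fatal_section_depth--;
}

/* Inside a fatal section an ERROR becomes FATAL; other levels are unchanged. */
static Level
effective_level(Level requested)
{
    if (requested == LVL_ERROR && fatal_section_depth > 0)
        return LVL_FATAL;
    return requested;
}
```

One-time initialization code would then be bracketed by the start/end calls, so a failed allocation in between ends the backend instead of leaving half-initialized state behind.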

Another idea is that instead of having these be singletons in the type
cache, initialized on first use, move it to a new TypeCacheInitialize()
function that is always called at backend startup, like
RelationCacheInitialize(). If an allocation fails at that stage, the
backend will just exit. I think that's my favorite alternative so far.
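
A toy model of the eager-initialization variant, again standalone C with hypothetical names (type_cache_initialize(), backend_startup() and lookup_model() are made up for this sketch and only mimic the shape of the idea): startup either sets up everything or reports failure, and the lookup path can assume the state exists.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the type cache state. */
static void *type_hash = NULL;
static void *relid_hash = NULL;

/* Test hook: when true, the second allocation fails. */
static bool simulate_oom = false;

/* Called once at startup; leaves no partial state behind on failure. */
static bool
type_cache_initialize(void)
{
    type_hash = malloc(64);
    if (type_hash == NULL)
        return false;

    relid_hash = simulate_oom ? NULL : malloc(64);
    if (relid_hash == NULL)
    {
        free(type_hash);
        type_hash = NULL;
        return false;
    }
    return true;
}

/* Models backend startup: a failed initialization means the backend exits. */
static int
backend_startup(void)
{
    if (!type_cache_initialize())
        return EXIT_FAILURE;
    return EXIT_SUCCESS;
}

/* The lookup path no longer needs any first-use initialization logic. */
static void *
lookup_model(void)
{
    assert(type_hash != NULL && relid_hash != NULL);
    return type_hash;
}
```

The trade-off is that every backend pays for the allocations up front, even if it never touches the type cache.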

BTW, I'm surprised the hash tables are created in TopMemoryContext
rather than CacheMemoryContext...

- Heikki

In response to

Re: Unexpected behavior after OOM errors at 2026-06-18 04:37:34 from Michael Paquier
