| From: | Michael Paquier <michael(at)paquier(dot)xyz> |
|---|---|
| To: | Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com> |
| Cc: | Alexander Lakhin <exclusion(at)gmail(dot)com>, pgsql-hackers <pgsql-hackers(at)postgresql(dot)org> |
| Subject: | Re: Unexpected behavior after OOM errors |
| Date: | 2026-06-18 23:55:30 |
| Message-ID: | ajSFcksGRhNJEs5H@paquier.xyz |
| Lists: | pgsql-hackers |

On Thu, Jun 18, 2026 at 11:27:28AM +0200, Matthias van de Meent wrote:
> Each of the calls to
> CacheRegisterSyscacheCallback/CacheRegisterRelcacheCallback can throw
> an ERROR when all slots have been used. This would leave the typcache
> in an invalid state, so I think that must be wrapped in a critical
> section: neither syscache nor relcache has options to release
> callbacks, and we can't safely continue without the callbacks
> installed, so once an error is thrown here this backend can't ever be
> properly initialized. This is unlike OOMs, whose conditions for
> failure may (and often do) change as workloads change in other
> backends.

We don't ERROR when failing to register a syscache/relcache callback;
we FATAL if we reach one of the thresholds. To me, reaching these
thresholds points to a programming error anyway, so these should not
matter in the field. An OOM, on the other hand, can happen at random,
triggered by conditions outside the Postgres realm.
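
To illustrate the distinction, here is a minimal standalone sketch of a fixed-slot callback registry that treats slot exhaustion as fatal rather than as a recoverable error. Every name in it (`DEMO_MAX_CALLBACKS`, `demo_register_callback()`, `demo_fatal()`, and so on) is hypothetical and the limit is arbitrary; it is not the code in the tree, only a model of the behavior described above.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical limit and names; not PostgreSQL's actual definitions. */
#define DEMO_MAX_CALLBACKS 4

typedef void (*demo_callback_fn) (int arg);

static demo_callback_fn demo_callback_list[DEMO_MAX_CALLBACKS];
static int	demo_callback_count = 0;
static int	demo_hits = 0;

/* Stand-in for a FATAL report: print the message and end the process. */
static void
demo_fatal(const char *msg)
{
	fprintf(stderr, "FATAL: %s\n", msg);
	exit(1);
}

/* Register a callback; running out of slots ends the process. */
static void
demo_register_callback(demo_callback_fn fn)
{
	assert(fn != NULL);
	if (demo_callback_count >= DEMO_MAX_CALLBACKS)
		demo_fatal("out of demo callback slots");
	demo_callback_list[demo_callback_count++] = fn;
}

/* Call every registered callback with the same argument. */
static void
demo_invoke_callbacks(int arg)
{
	for (int i = 0; i < demo_callback_count; i++)
		demo_callback_list[i] (arg);
}

/* Example callback that only counts how often it was called. */
static void
demo_count_hit(int arg)
{
	(void) arg;
	demo_hits++;
}
```

Since the process ends as soon as the limit is exceeded, no caller ever observes a half-registered state and has to recover from it, which is why hitting the limit looks like a coding mistake rather than a runtime condition.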

Just in case, I have planted an elog(FATAL) triggering randomly in the
middle of the cache callback registration calls, and the typcache
inconsistency does not come into play with the shutdown sequence once
these trigger, even if we have the tables set but not the callbacks.

As a whole, I tend to think that reordering the actions is a good
enough solution here.
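
As a rough illustration of what such a reordering can look like in general (this is not the actual patch, and all names are hypothetical), the sketch below runs the steps that may fail first, keeps their results in local variables, and only publishes the "initialized" state once nothing else can fail. It reports failure through a return value and uses a small fault-injection counter, whereas the backend would report an OOM through ereport(ERROR).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
static void *demo_table_a = NULL;	/* non-NULL means "cache initialized" */
static void *demo_table_b = NULL;
static int	demo_callbacks_registered = 0;

/* Fault injection: number of allocations that succeed before one fails. */
static int	demo_allocs_until_failure = -1; /* negative: never fail */

static void *
demo_alloc(size_t size)
{
	if (demo_allocs_until_failure == 0)
	{
		demo_allocs_until_failure = -1;
		return NULL;			/* simulated out-of-memory */
	}
	if (demo_allocs_until_failure > 0)
		demo_allocs_until_failure--;
	return malloc(size);
}

/* Returns true once the cache is usable, false on a simulated OOM. */
static bool
demo_cache_init(void)
{
	void	   *a;
	void	   *b;

	if (demo_table_a != NULL)
		return true;			/* already initialized */

	/* Fallible steps first, with results kept in locals. */
	a = demo_alloc(64);
	if (a == NULL)
		return false;
	b = demo_alloc(64);
	if (b == NULL)
	{
		free(a);
		return false;
	}

	/* Step that cannot be undone; nothing after it can fail. */
	demo_callbacks_registered++;

	/* Publish last, so a failed attempt is never seen as initialized. */
	demo_table_a = a;
	demo_table_b = b;
	assert(demo_table_a != NULL && demo_table_b != NULL);
	return true;
}
```

With this ordering, a failed attempt leaves `demo_table_a` unset, so a later call simply runs the whole initialization again instead of skipping the callback registration.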

> I think Heikki's suggestion for a FATAL critical section option would
> be a good alternative. It wouldn't always be sufficient, but would fix
> issues here.

That sounds like an interesting idea, potentially reusable for other
areas, but I'm not really convinced that we need to add this kind of
facility for the case dealt with here. To me, that's also where we
could use a TRY/CATCH block and call it a day. If others feel
differently about this matter, I'm fine to be outvoted.
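
For reference, the TRY/CATCH alternative can be sketched as below. PostgreSQL's PG_TRY()/PG_CATCH() macros are built on sigsetjmp(); this standalone model uses plain setjmp()/longjmp() instead, and every name in it is hypothetical. The catch branch resets the partially built state so that a later attempt starts from scratch.

```c
#include <assert.h>
#include <setjmp.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model of an error-handler stack; not PostgreSQL code. */
static jmp_buf *demo_handler = NULL;	/* innermost active handler */
static void *demo_table = NULL; /* non-NULL means "initialized" */
static bool demo_fail_second_step = false;	/* fault-injection switch */

/* Stand-in for raising an ERROR: jump to the innermost handler. */
static void
demo_throw(void)
{
	if (demo_handler == NULL)
		abort();				/* no handler installed */
	longjmp(*demo_handler, 1);
}

/* A later initialization step that may fail. */
static void
demo_second_step(void)
{
	if (demo_fail_second_step)
		demo_throw();
}

/* Returns true on success, false if an error was caught and cleaned up. */
static bool
demo_init_guarded(void)
{
	jmp_buf		local;
	jmp_buf    *saved = demo_handler;

	if (setjmp(local) == 0)
	{
		/* "try" branch */
		demo_handler = &local;
		demo_table = malloc(64);
		demo_second_step();
		demo_handler = saved;
		return true;
	}

	/* "catch" branch: restore the handler and undo the partial state. */
	demo_handler = saved;
	free(demo_table);
	demo_table = NULL;
	return false;				/* backend code would usually re-throw here */
}
```

The state touched on both sides of the jump lives in static variables, which avoids the usual setjmp() caveat about modified local variables.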
--
Michael