From: | Kirill Reshke <reshkekirill(at)gmail(dot)com> |
---|---|
To: | Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> |
Cc: | Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Subject: | Re: VM corruption on standby |
Date: | 2025-08-19 17:00:53 |
Message-ID: | CALdSSPisWpkL+-_vS7B7vonX1XTC8aVkPhj3BBc2wtmuZ_a7cQ@mail.gmail.com |
Lists: | pgsql-hackers |
On Tue, 19 Aug 2025 at 21:16, Yura Sokolov <y(dot)sokolov(at)postgrespro(dot)ru> wrote:
>
> That is not true.
> elog(PANIC) doesn't clear LWLocks. And XLogWrite, which is could be called
> from AdvanceXLInsertBuffer, may call elog(PANIC) from several places.
>
> It doesn't lead to any error, because usually postmaster is alive and it
> will kill -9 all its children if any one is died in critical section.
>
> So the problem is postmaster is already killed with SIGKILL by definition
> of the issue.
>
> Documentation says [0]:
> > If at all possible, do not use SIGKILL to kill the main postgres server.
> > Doing so will prevent postgres from freeing the system resources (e.g.,
> > shared memory and semaphores) that it holds before terminating.
>
> Therefore if postmaster SIGKILL-ed, administrator already have to do some
> actions.
>
There are surely many cases where a system reaches a state that can
only be fixed by admin action.
An elog(PANIC) inside a critical section is very rare (and very probably
means there is corruption already).
A simpler example is to kill -9 the postmaster and then immediately
kill -9 a backend that holds an LWLock.
The problem in PostgreSQL 18 is that the probability of reaching this
state is much higher due to the aforementioned commit. It can happen
with almost any OOM on highly loaded systems.
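
To make the failure mode concrete, here is a toy sketch. It is plain Python
on a POSIX system, not PostgreSQL code: the one-byte shared flag stands in
for an LWLock word, and the name orphaned_lock_demo is made up for
illustration. It only shows the general point that a process killed with
SIGKILL runs no cleanup, so a lock it held in shared memory stays held.

```python
import mmap
import os
import signal
import time


def orphaned_lock_demo(timeout=5.0):
    """Return True if a 'lock' flag stays set after its holder is SIGKILLed."""
    # One byte of anonymous shared memory stands in for a lock word
    # (hypothetical stand-in, not how PostgreSQL lays out LWLocks).
    shared = mmap.mmap(-1, 1)  # shared between parent and child on Unix
    shared[0] = 0

    pid = os.fork()
    if pid == 0:
        # Child: "acquire" the lock, then block as if stuck mid-operation.
        shared[0] = 1
        while True:
            time.sleep(60)

    # Parent: wait until the child reports that it holds the lock.
    deadline = time.monotonic() + timeout
    while shared[0] == 0:
        if time.monotonic() > deadline:
            os.kill(pid, signal.SIGKILL)
            os.waitpid(pid, 0)
            raise RuntimeError("child never took the lock")
        time.sleep(0.01)

    # SIGKILL cannot be caught, so no release code runs in the child.
    os.kill(pid, signal.SIGKILL)
    os.waitpid(pid, 0)

    still_held = shared[0] == 1
    shared.close()
    return still_held


if __name__ == "__main__":
    print("lock still held after SIGKILL:", orphaned_lock_demo())
```

In this sketch the parent could reset the flag itself, which is roughly the
role a live postmaster plays. If the supervisor is already gone, nobody is
left to do that, and anything waiting on the flag would wait forever.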
--
Best regards,
Kirill Reshke