From: | Aleksander Alekseev <aleksander(at)tigerdata(dot)com> |
---|---|
To: | PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Cc: | Andrey Borodin <x4mmm(at)yandex-team(dot)ru>, Melanie Plageman <melanieplageman(at)gmail(dot)com> |
Subject: | Re: VM corruption on standby |
Date: | 2025-08-07 15:17:17 |
Message-ID: | CAJ7c6TOtYagmAm+f4B3JEWoahG3bocoBNe1Gvdrjejo5MMMC1g@mail.gmail.com |
Lists: | pgsql-hackers |
Hi,
> If my understanding is correct, we should make a WAL record with the
> XLH_LOCK_ALL_FROZEN_CLEARED flag *before* we modify the VM but within
> the same critical section [...]
>
> A draft patch is attached. It makes the test pass and doesn't seem to
> break any other tests.
>
> Thoughts?
So that this doesn't get forgotten: assuming I'm right about the cause of
the issue, we might also want to recheck the order of the visibilitymap_*
and XLog* calls in the following functions:
- heap_multi_insert
- heap_delete
- heap_update
- heap_lock_tuple
- heap_lock_updated_tuple_rec
At a quick glance, all of the functions named above modify the VM before
writing the corresponding WAL record. This can cause a similar issue:
1. The VM is modified
2. The VM page is evicted asynchronously before the WAL record is written
3. kill -9
4. The VM state differs between primary and standby
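
The four steps above can be sketched as a toy model. This is not PostgreSQL code: the Node struct, its fields, and vm_consistent_after_crash() are hypothetical names invented for illustration. The model also assumes that a WAL record, once written, is durable and reaches the standby, which glosses over WAL flushing. It only shows why the ordering matters under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (hypothetical, not PostgreSQL code): one all-frozen bit per node. */
typedef struct Node
{
    bool vm_buffer;   /* VM bit as seen in shared memory */
    bool vm_disk;     /* VM bit as last written to disk */
} Node;

/*
 * Simulates one "clear the all-frozen bit" operation on the primary,
 * followed by an eviction of the VM page and a kill -9.
 * Returns true if primary and standby agree on the VM bit after the
 * primary restarts and both nodes have replayed whatever WAL exists.
 */
bool
vm_consistent_after_crash(bool wal_before_vm)
{
    Node primary = {true, true};
    Node standby = {true, true};
    bool wal_written = false;   /* did the "cleared" record reach the WAL? */

    /* Both nodes start in sync with the bit set. */
    assert(primary.vm_buffer && standby.vm_buffer);

    if (wal_before_vm)
    {
        /* Ordering under discussion as the fix: log first, then modify. */
        wal_written = true;
        primary.vm_buffer = false;
        primary.vm_disk = primary.vm_buffer;   /* page evicted to disk */
        /* kill -9 happens here */
    }
    else
    {
        primary.vm_buffer = false;             /* 1. VM modified */
        primary.vm_disk = primary.vm_buffer;   /* 2. evicted before logging */
        /* 3. kill -9 happens here, so no WAL record is ever written */
    }

    /* Primary restart: shared memory is lost, state comes from disk + WAL. */
    primary.vm_buffer = primary.vm_disk;
    if (wal_written)
        primary.vm_buffer = false;

    /* The standby only ever learns about changes through the WAL. */
    if (wal_written)
        standby.vm_buffer = false;

    /* 4. compare the VM state on primary and standby */
    return primary.vm_buffer == standby.vm_buffer;
}
```

In this model, vm_consistent_after_crash(false) returns false because the cleared bit reached the primary's disk but never reached the WAL, while vm_consistent_after_crash(true) returns true.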