From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Noah Misch <noah(at)leadboat(dot)com> |
Cc: | Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Vitaly Davydov <v(dot)davydov(at)postgrespro(dot)ru> |
Subject: | Re: Issues with 2PC at recovery: CLOG lookups and GlobalTransactionData |
Date: | 2025-05-22 01:30:40 |
Message-ID: | aC5-QIQsjnjylQyo@paquier.xyz |
Lists: | pgsql-hackers |
On Fri, May 09, 2025 at 02:08:26PM +0900, Michael Paquier wrote:
> One extra thing that I have mentioned is that we could replace the
> CLOG safeguards based on what we know from the checkpoint record based
> on the oldest XID horizon of the checkpoint record and its next XID:
> - If we have a 2PC file older than the oldest XID horizon, we know
> that it should not exist.
> - If we have a 2PC file newer than the next XID, same, or we'll know
> about it while replaying.
>
> WDYT?
I have been going back and forth on this patch set for the last few
weeks. Please find attached a refreshed version. These patches are
aimed only at v19 for the time being.
Patch 0001 is a refactoring of the existing 2PC code to integrate full
XIDs deeper into these code paths, easing the evaluation of the 2PC
files by not having to guess from which epoch they are. This is worth
improving on its own, and I'm hoping that this is acceptable as-is for
HEAD.
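
As an illustration of the epoch guessing that full XIDs avoid, here is a minimal standalone sketch, not code from the patch. It assumes a full XID is modeled as a 64-bit value with the epoch in the high 32 bits; the names full_xid, make_full_xid() and guess_full_xid() are hypothetical and exist only for this example. With only a 32-bit XID in hand, the epoch has to be inferred from the next full XID, which is the kind of guess that carrying full XIDs through the 2PC code paths makes unnecessary.

```c
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit full XID: epoch in the high 32 bits. */
typedef uint64_t full_xid;

/* Build a full XID from an epoch and a 32-bit XID. */
static full_xid
make_full_xid(uint32_t epoch, uint32_t xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/*
 * Guess the full XID of a 32-bit XID, given the next full XID to be
 * assigned.  If the 32-bit value is larger than the low 32 bits of the
 * next XID, it is assumed to belong to the previous epoch.
 */
static full_xid
guess_full_xid(uint32_t xid, full_xid next_full)
{
    uint32_t next_xid = (uint32_t) next_full;
    uint32_t epoch = (uint32_t) (next_full >> 32);

    if (xid > next_xid && epoch > 0)
        epoch--;                /* wraparound: XID is from the prior epoch */

    return make_full_xid(epoch, xid);
}
```

For example, with the next full XID at epoch 2 and XID 500, a 32-bit XID of 100 is attributed to epoch 2, while a 32-bit XID of 4000000000 is attributed to epoch 1.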
To summarize, patch 0002 provides fixes along the lines of what I have
mentioned upthread, based on the following points:
- restoreTwoPhaseData() is in charge of checking the 2PC files in
pg_twophase/ at the beginning of recovery based on the oldest and
newest XID horizons retrieved from the checkpoint record, discarding
files seen as too new or too old.
- CLOG lookups are delayed until the end of recovery, handled by
RecoverPreparedTransactions() when we restore the 2PC data in shmem
filled during recovery. At this stage, checking for aborted and
committed transactions should be safe, as the cluster becomes ready
for WAL activity in the steps after this call.
- ProcessTwoPhaseBuffer() can never return NULL, and no longer includes
any sanity checks.
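
To make the horizon check in the first point concrete, here is a rough standalone sketch, not the code of the patch. It assumes that full XIDs compare as plain 64-bit integers, and that a file whose XID equals the next XID counts as too new; both the boundary treatment and the names twophase_verdict and classify_twophase_file() are assumptions made for this example.

```c
#include <stdint.h>

/* Hypothetical verdicts for a 2PC file found in pg_twophase/. */
typedef enum
{
    TWOPHASE_KEEP,              /* within [oldest, next): keep the file */
    TWOPHASE_TOO_OLD,           /* older than the oldest XID horizon */
    TWOPHASE_TOO_NEW            /* at or past the next XID */
} twophase_verdict;

/*
 * Classify a 2PC file from its full XID, using the oldest XID horizon
 * and the next XID taken from the checkpoint record.  No CLOG lookup is
 * involved; only the two horizons are consulted.
 */
static twophase_verdict
classify_twophase_file(uint64_t file_fxid,
                       uint64_t oldest_fxid,
                       uint64_t next_fxid)
{
    if (file_fxid < oldest_fxid)
        return TWOPHASE_TOO_OLD;
    if (file_fxid >= next_fxid)
        return TWOPHASE_TOO_NEW;
    return TWOPHASE_KEEP;
}
```

With an oldest horizon of 1000 and a next XID of 2000, a file for XID 1500 is kept, while files for XID 900 and XID 2500 are discarded as too old and too new respectively.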
Thoughts or comments are welcome.
--
Michael
Attachment | Content-Type | Size |
---|---|---|
v3-0001-Integrate-more-FullTransactionIds-into-2PC-code.patch | text/x-diff | 43.1 KB |
v3-0002-Improve-handling-of-2PC-files-during-recovery.patch | text/x-diff | 16.4 KB |