From: | Michael Paquier <michael(at)paquier(dot)xyz> |
---|---|
To: | Fujii Masao <masao(dot)fujii(at)gmail(dot)com> |
Cc: | Chao Li <li(dot)evan(dot)chao(at)gmail(dot)com>, Postgres hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org> |
Subject: | Re: Fix pg_waldump to exit cleanly at end of WAL |
Date: | 2025-09-03 02:47:06 |
Message-ID: | aLesKvM9QpvCVJd2@paquier.xyz |
Lists: | pgsql-hackers |
On Wed, Sep 03, 2025 at 09:11:15AM +0900, Fujii Masao wrote:
> Can pg_waldump really distinguish between the end of WAL and corruption?
I don't think you can do that reliably: as far as I recall, some of
the messages that mark the end of WAL can also show up on corruption.
We need the record CRC check to make the distinction, and we cannot
run it at this stage because we don't have the full record yet.
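
To illustrate the point above, here is a minimal sketch using a toy record format. It is not the PostgreSQL WAL layout and not xlogreader code: the 8-byte header, the `make_record` and `check_record` helpers, and the result strings are all made up for the example, and `zlib.crc32` stands in for the CRC-32C that WAL records actually use. It shows that a bad length field looks the same whether it comes from zeroed space past the end of the log or from a damaged header, and that the CRC can only be checked once every byte announced by the header is available.

```python
import struct
import zlib

# Toy header (assumption, not the real XLogRecord): total length, payload CRC.
HEADER = struct.Struct("<II")


def make_record(payload: bytes) -> bytes:
    """Build a toy record: header followed by the payload."""
    return HEADER.pack(HEADER.size + len(payload), zlib.crc32(payload)) + payload


def check_record(buf: bytes, offset: int = 0) -> str:
    """Classify what can be said about the record starting at offset."""
    if len(buf) - offset < HEADER.size:
        return "incomplete-header"
    tot_len, crc = HEADER.unpack_from(buf, offset)
    # A too-small length is all we know here: zeroed space at the end of
    # the log and a corrupted header are indistinguishable at this stage.
    if tot_len < HEADER.size:
        return "invalid-length"
    # The CRC covers the payload, so it cannot be verified before the
    # whole record has been read.
    if len(buf) - offset < tot_len:
        return "incomplete-record"
    payload = buf[offset + HEADER.size:offset + tot_len]
    return "ok" if zlib.crc32(payload) == crc else "bad-crc"


if __name__ == "__main__":
    rec = make_record(b"hello")
    print(check_record(rec))               # complete, valid record
    print(check_record(bytes(16)))         # zeroed space past the end of the log
    print(check_record(bytes(4) + rec[4:]))  # real record with a damaged length
```

In this sketch the last two calls return the same classification, which is the ambiguity described above.
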
Perhaps what was posted on your thread [1] could be revisited for the
xlogreader, because we can now read record headers more reliably
thanks to Thomas' work around bae868caf222. That would mean
backtracking on my previous take, which I posted before that commit:
https://www.postgresql.org/message-id/ZadmUE-edk2Z4CQU@paquier.xyz
Discarding the error message when we read what we think is an
incorrect value for the first field in the record header (total record
length) means that we may lose some information that's actually legit
to know about, so the proposed patch is wrong. Tweaking xlogreader.c
to let its callers make the decision would be better, even if it puts
the cost of the decision on all the tools. One problem is that this
adds some complexity to xlogreader.c itself, which may not be worth
the trouble.
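
As a rough sketch of what "let the callers decide" could look like, here is a toy model in which the reader hands back a structured failure instead of dropping or printing the message, and each tool applies its own policy. Nothing here exists in xlogreader.c: `FailureKind`, `ReadFailure`, and the two policy functions are hypothetical names for illustration only.

```python
from dataclasses import dataclass
from enum import Enum, auto


class FailureKind(Enum):
    """Hypothetical reasons a reader could stop; not an xlogreader.c enum."""
    INVALID_LENGTH = auto()
    BAD_CRC = auto()


@dataclass
class ReadFailure:
    """What the reader reports; the message is always preserved."""
    kind: FailureKind
    message: str


def dump_tool_policy(failure: ReadFailure) -> tuple[int, str]:
    """A dump-style tool: a bad length is treated as a probable end of WAL."""
    if failure.kind is FailureKind.INVALID_LENGTH:
        return 0, f"note: stopped reading: {failure.message}"
    return 1, f"error: {failure.message}"


def strict_tool_policy(failure: ReadFailure) -> tuple[int, str]:
    """A stricter tool: any read failure is reported as an error."""
    return 1, f"error: {failure.message}"


if __name__ == "__main__":
    failure = ReadFailure(FailureKind.INVALID_LENGTH, "invalid record length")
    for policy in (dump_tool_policy, strict_tool_policy):
        exit_code, text = policy(failure)
        print(exit_code, text)
```

Both policies keep the reader's message, so no information is lost; the cost is that every tool has to carry such a policy, which is the trade-off mentioned above.
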
(Note: that's likely a biased opinion as I am used to living with these
messages when running WAL record parsers, but I understand that for
newcomers these are confusing to read the first time as they can be
read as "my cluster is deeply broken and my WAL is corrupted".)
--
Michael