Re: Make message at end-of-recovery less scary.

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: aleksander(at)timescale(dot)com
Cc: pgsql-hackers(at)lists(dot)postgresql(dot)org, stark(dot)cfm(at)gmail(dot)com, alvherre(at)alvh(dot)no-ip(dot)org, pryzby(at)telsasoft(dot)com, jchampion(at)timescale(dot)com, andres(at)anarazel(dot)de, ashu(dot)coek88(at)gmail(dot)com, pashkin(dot)elfe(at)gmail(dot)com, michael(at)paquier(dot)xyz, bossartn(at)amazon(dot)com, david(at)pgmasters(dot)net, peter(dot)eisentraut(at)2ndquadrant(dot)com, jtc331(at)gmail(dot)com, robertmhaas(at)gmail(dot)com
Subject: Re: Make message at end-of-recovery less scary.
Date: 2023-07-20 05:02:17
Message-ID: 20230720.140217.263286895826895806.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Mon, 17 Jul 2023 15:20:30 +0300, Aleksander Alekseev <aleksander(at)timescale(dot)com> wrote in
> Thanks for working on this, it bugged me for a while. I noticed that
> cfbot is not happy with the patch so I rebased it.
> postgresql:pg_waldump test suite didn't pass after the rebase. I fixed
> it too. Other than that the patch LGTM so I'm not changing its status
> from "Ready for Committer".

Thanks for the rebasing.

> It looks like the patch was moved between the commitfests since
> 2020... If there is anything that may help merging it into PG17 please
> let me know.

The patch might simply be too much, or there may be some doubts about
it.

This change basically makes a zero-length record be treated as the
normal end of WAL.

The most controversial point in the design, I think, is the criterion
for an error condition. The assumption is that the WAL is sound if all
bytes following a complete record, up to the next page boundary, are
zeroed out. This is slightly narrower than the original criterion,
which merely checked that the next record is zero-length. Naturally,
there might be instances where that page has been wiped out due to
device failure or some other reason. Despite this, I believe it is
preferable to always issuing a warning (at the LOG level, though)
about potential WAL corruption.
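
To illustrate the criterion, here is a minimal standalone sketch, not
the code in the patch. The function name, the macro, and the flat
page buffer are assumptions made for illustration only; the real
check works on the WAL reader's buffers.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed page size for the sketch; 8192 is the default WAL block size. */
#define SKETCH_WAL_PAGE_SIZE 8192

/*
 * Hypothetical helper: return true if every byte from `offset` (the end
 * of the last complete record within the page) up to the page boundary
 * is zero.  Under the proposed criterion this is taken as the normal
 * end of WAL; any nonzero byte is treated as possible corruption.
 */
bool
sketch_rest_of_page_is_zeroed(const unsigned char *page,
                              size_t page_size, size_t offset)
{
    for (size_t i = offset; i < page_size; i++)
    {
        if (page[i] != 0)
            return false;   /* garbage after the last record */
    }
    return true;            /* clean, zero-filled tail */
}
```

Checking only that the next record's length field is zero would
accept a page with garbage further on; scanning to the page boundary
is what makes this condition narrower.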

I've adjusted the condition for muting repeated log messages at the
same LSN, changing it from ==LOG to <=WARNING. This simply follows
from the change of "real" warnings from LOG to WARNING. I believe this
is acceptable even without considering the aforementioned change, as
any single retriable (<ERROR) error at an LSN should be sufficient to
alert users about potential issues.
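
The muting rule can be sketched roughly as follows. This is only an
illustration under assumptions: the enum, its numeric values, and the
function name are hypothetical, and only the ordering
LOG < WARNING < ERROR matters here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical severity levels; only their relative order is relied on. */
enum sketch_level { SK_LOG = 15, SK_WARNING = 19, SK_ERROR = 21 };

/*
 * Hypothetical helper: decide whether a message at `level` for `lsn`
 * should be reported.  Any retriable level (<= SK_WARNING) is reported
 * once per LSN and muted on repeats; SK_ERROR and above always pass.
 * `last_reported_lsn` holds the LSN of the last retriable message
 * (0 meaning "none yet").
 */
bool
sketch_should_report(int level, uint64_t lsn, uint64_t *last_reported_lsn)
{
    if (level <= SK_WARNING)
    {
        if (lsn == *last_reported_lsn)
            return false;           /* already complained at this LSN */
        *last_reported_lsn = lsn;   /* remember it for the next attempt */
    }
    return true;
}
```

With the old ==LOG condition, a WARNING repeated at the same LSN
would not be muted; with <=WARNING, the first retriable message at an
LSN is the only one shown.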

regards.

--
Kyotaro Horiguchi
NTT Open Source Software Center
