Re: BUG #17928: Standby fails to decode WAL on termination of primary

From: Alexander Lakhin <exclusion(at)gmail(dot)com>
To: Thomas Munro <thomas(dot)munro(at)gmail(dot)com>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Sergei Kornilov <sk(at)zsrv(dot)org>, Noah Misch <noah(at)leadboat(dot)com>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, pgsql-bugs(at)lists(dot)postgresql(dot)org, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Subject: Re: BUG #17928: Standby fails to decode WAL on termination of primary
Date: 2023-09-16 06:00:00
Message-ID: c3f123ca-03f7-b7ce-1e49-f8fee4c16545@gmail.com
Lists: pgsql-bugs

Hello,

16.09.2023 03:20, Thomas Munro wrote:
> On Sat, Sep 16, 2023 at 12:03 PM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>>> [1] https://github.com/macdice/postgres/commits/fix-12
>> Hmm. What was the test that failed?
> $ make -s -C src/test/recovery/ check PROVE_TESTS=t/039*
> t/039_end_of_wal.pl .. 4/?
> # Failed test 'xlp_magic zero'
> # at t/039_end_of_wal.pl line 312.
>
> not ok 5 - xlp_magic zero
>
> Where the log should say "invalid magic number 0000" I see:
>
> 2023-09-16 12:13:07.331 NZST [156812] LOG: record with incorrect
> prev-link 0/16B60C0 at 0/16B6120
>
> It has to do with initial WAL position after initdb, because I get
> this only on Debian, on REL_12_STABLE (with the commit listed above on
> my public fix-12 branch) and only with --with-icu, but not without it,
> and I can't repro it on my other local OSes.

I tried to reproduce the failure on Debian 9, 10, and 11, but have not succeeded yet.
I did, however, get a different error on Debian 9:
t/039_end_of_wal.pl .. Dubious, test returned 25 (wstat 6400, 0x1900)
No subtests run
...
cat src/test/recovery/tmp_check/log/regress_log_039_end_of_wal
could not find match in header access/xlog_internal.h

It looks like the "@{^CAPTURE}" construct used in scan_server_header()
is not supported by Perl 5.24, which is the version included in Debian stretch:
https://perldoc.perl.org/variables/@%7B%5ECAPTURE%7D
I replaced it with
@match = ($1);
and that worked for me.
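
For illustration only, here is a minimal sketch of another way to avoid
"@{^CAPTURE}" that also keeps every capture group, whereas ($1) keeps only
the first. It is not the code from 039_end_of_wal.pl: the helper name
first_match_captures(), the sample lines, and the EXAMPLE_MAGIC define are
all made up for this example. It relies on the fact that a match evaluated
in list context returns its capture groups.

```perl
use strict;
use warnings;

# Hypothetical helper, not the function from the test suite: returns the
# capture groups of the first line matching $regexp, or an empty list.
sub first_match_captures
{
    my ($regexp, @lines) = @_;

    foreach my $line (@lines)
    {
        # In list context a successful match returns its capture groups,
        # so @{^CAPTURE} (which Perl 5.24 lacks) is not needed.
        my @match = $line =~ /^$regexp/;
        return @match if @match;
    }
    return;
}

# Sample input invented for this example.
my @lines = (
    '/* some header text */',
    '#define EXAMPLE_MAGIC 0x1234',
);
my @found = first_match_captures('#define\s+(\w+)\s+(\w+)', @lines);
print join(' ', @found), "\n";
```

One caveat with this approach: if the pattern has no capture groups at all,
a successful list-context match returns (1) rather than an empty list.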

Also, I observed that setting "wal_log_hints = on" in extra.config, which I use via
"TEMP_CONFIG=extra.config make check-world", makes the test fail too, though
check-world passes fine without the new test.
Maybe that's not an issue, and there are probably other parameters that
might affect this test, but I'm somewhat confused by the fact that only this
test breaks with it.

Best regards,
Alexander
