Re: Is this a problem in GenericXLogFinish()?

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Alexander Lakhin <exclusion(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Jeff Davis <pgsql(at)j-davis(dot)com>, Heikki Linnakangas <hlinnaka(at)iki(dot)fi>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: Is this a problem in GenericXLogFinish()?
Date: 2024-04-10 07:57:43
Message-ID: ZhZGd7UdJA0jljAY@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 09, 2024 at 10:25:36AM +0000, Hayato Kuroda (Fujitsu) wrote:
>> On Fri, Apr 05, 2024 at 06:22:58AM +0000, Hayato Kuroda (Fujitsu) wrote:
>> I'm slightly annoyed by the fact that there is no check on
>> is_prim_bucket_same_wrt for buffer 1 in the BLK_NEEDS_REDO case to
>> show the symmetry between the insert and replay paths. Shouldn't
>> there be at least an assert for that in the branch when there are no
>> tuples (imagine an else to cover xldata->ntups == 0)? I mean between
>> just before updating the hash page's opaque area when
>> is_prev_bucket_same_wrt.
>
> Indeed, added an Assert() in else part. Was it same as your expectation?

Yep, close enough. Thanks to the insert path we know that there will
be no tuples if (is_prim_bucket_same_wrt || is_prev_bucket_same_wrt),
and the replay path where the assertion is added.

>> I've been thinking about ways to make such cases detectable in an
>> automated fashion. The best choice would be
>> verifyBackupPageConsistency(), just after RestoreBlockImage() on the
>> copy of the block image stored in WAL before applying the page masking
>> which would mask the LSN. A problem, unfortunately, is that we would
>> need to transfer the knowledge of REGBUF_NO_CHANGE as a new BKPIMAGE_*
>> flag so we would be able to track if the block rebuilt from the record
>> has the *same* LSN as the copy used for the consistency check. So
>> this edge consistency case would come at a cost, I am afraid, and the
>> 8 bits of bimg_info are precious :/
>
> I could not decide that the change has more benefit than its cost, so I did
> nothing for it.

It does not prevent doing an implementation that can be used for some
test validation in the context of this thread. Using this idea, I
have done the attached 0002. This is not something to merge into the
tree, of course, just a toy to:
- Validate the fix for the problem we know.
- More braodly, check if there are other areas covered by the main
regression test suite if there are other problems.

And from what I can see, applying 0002 without 0001 causes the
following test to fail as the standby chokes on a inconsistent LSN
with a FATAL because the LSN of the apply page coming from the primary
and the LSN saved in the page replayed don't match. Here is a command
to reproduce the problem:
cd src/test/recovery/ && \
PG_TEST_EXTRA=wal_consistency_checking \
PROVE_TESTS=t/027_stream_regress.pl make check

And then in the logs you'd see stuff like:
2024-04-10 16:52:21.844 JST [2437] FATAL: inconsistent page LSN
replayed 0/A7E5CD18 primary 0/A7E56348
2024-04-10 16:52:21.844 JST [2437] CONTEXT: WAL redo at 0/A7E59F68
for Hash/SQUEEZE_PAGE: prevblkno 28, nextblkno 4294967295, ntups 0,
is_primary T; blkref #1: rel 1663/16384/25434, blk 7 FPW; blkref #2:
rel 1663/16384/25434, blk 29 FPW; blkref #3: rel 1663/16384/25434, blk
28 FPW; blkref #5: rel 1663/16384/25434, blk 9 FPW; blkref #6: rel
1663/16384/25434, blk 0 FPW

I don't see other areas with a similar problem, at the extent of the
core regression tests, that is to say. That's my way to say that your
patch looks good to me and that I'm planning to apply it to fix the
issue.

This shows me a more interesting issue unrelated to this thread:
027_stream_regress.pl would be stuck if the standby finds an
inconsistent page under wal_consistency_checking. This needs to be
fixed before I'm able to create a buildfarm animal with this setup.
I'll spawn a new thread about that tomorrow.
--
Michael

Attachment Content-Type Size
v3-0001-Fix-inconsistency-with-replay-of-hash-squeeze-rec.patch text/x-diff 1.6 KB
v3-0002-Implement-consistency-check-with-clean-buffers-fo.patch text/x-diff 6.1 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sutou Kouhei 2024-04-10 08:16:26 Re: Make COPY format extendable: Extract COPY TO format implementations
Previous Message Peter Eisentraut 2024-04-10 07:31:16 Re: Cutting support for OpenSSL 1.0.1 and 1.0.2 in 17~?