Re: A failure in t/038_save_logical_slots_shutdown.pl

From: vignesh C <vignesh21(at)gmail(dot)com>
To: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
Cc: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: A failure in t/038_save_logical_slots_shutdown.pl
Date: 2024-01-11 11:05:03
Message-ID: CALDaNm1Kr0FWk9K4E3zht_CPjPz4wxMU6b=LvJtHW6jYGO1fdQ@mail.gmail.com
Lists: pgsql-hackers

On Wed, 10 Jan 2024 at 18:37, vignesh C <vignesh21(at)gmail(dot)com> wrote:
>
> On Wed, 10 Jan 2024 at 14:08, Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > Hi,
> >
> > I've been observing a failure in t/038_save_logical_slots_shutdown.pl
> > of late on my developer system:
> >
> > t/038_save_logical_slots_shutdown.pl .. 1/?
> > # Failed test 'Check that the slot's confirmed_flush LSN is the same
> > as the latest_checkpoint location'
> > # at t/038_save_logical_slots_shutdown.pl line 35.
> > # Looks like you failed 1 test of 2.
> > t/038_save_logical_slots_shutdown.pl .. Dubious, test returned 1
> > (wstat 256, 0x100)
> > Failed 1/2 subtests
> >
> > I did a quick analysis of the failure and commit
> > https://github.com/postgres/postgres/commit/e0b2eed047df9045664da6f724cb42c10f8b12f0
> > that introduced this test. I think the issue is that the slot's
> > confirmed_flush LSN (0/1508000) and shutdown checkpoint LSN
> > (0/1508018) are not the same:
> >
> > tmp_check/log/038_save_logical_slots_shutdown_pub.log:
> >
> > 2024-01-10 07:55:44.539 UTC [57621] sub LOG: starting logical
> > decoding for slot "sub"
> > 2024-01-10 07:55:44.539 UTC [57621] sub DETAIL: Streaming
> > transactions committing after 0/1508000, reading WAL from 0/1507FC8.
> > 2024-01-10 07:55:44.539 UTC [57621] sub STATEMENT: START_REPLICATION
> > SLOT "sub" LOGICAL 0/0 (proto_version '4', origin 'any',
> > publication_names '"pub"')
> >
> > ubuntu:~/postgres$ pg17/bin/pg_controldata -D
> > src/test/recovery/tmp_check/t_038_save_logical_slots_shutdown_pub_data/pgdata/
> > Database cluster state: in production
> > pg_control last modified: Wed Jan 10 07:55:44 2024
> > Latest checkpoint location: 0/1508018
> > Latest checkpoint's REDO location: 0/1508018
> >
> > But the test added by t/038_save_logical_slots_shutdown.pl expects
> > both LSNs to be the same:
> >
> > sub compare_confirmed_flush
> > {
> > # Is it same as the value read from log?
> > ok( $latest_checkpoint eq $confirmed_flush_from_log,
> > "Check that the slot's confirmed_flush LSN is the same as the
> > latest_checkpoint location"
> > );
> >
> > I suspect it's not quite right to expect the slot's confirmed_flush
> > and latest checkpoint location to be the same in the test. This is
> > because the shutdown checkpoint gets an LSN greater than the slot's
> > confirmed_flush LSN - the shutdown checkpoint record is inserted
> > into WAL after the slot is marked dirty in
> > CheckPointReplicationSlots().
> >
> > With this analysis in mind, I think the tests need to do something
> > like the following:
> >
> > diff --git a/src/test/recovery/t/038_save_logical_slots_shutdown.pl
> > b/src/test/recovery/t/038_save_logical_slots_shut
> > down.pl
> > index 5a4f5dc1d4..d49e6014fc 100644
> > --- a/src/test/recovery/t/038_save_logical_slots_shutdown.pl
> > +++ b/src/test/recovery/t/038_save_logical_slots_shutdown.pl
> > @@ -32,7 +32,7 @@ sub compare_confirmed_flush
> > unless defined($latest_checkpoint);
> >
> > # Is it same as the value read from log?
> > - ok( $latest_checkpoint eq $confirmed_flush_from_log,
> > + ok( $latest_checkpoint ge $confirmed_flush_from_log,
> > "Check that the slot's confirmed_flush LSN is the same
> > as the latest_checkpoint location"
> > );
> >
> > Thoughts?
>
> I got the log files from Bharath offline. Thanks, Bharath, for
> sharing them.
> The WAL record sequence is exactly the same in the failing test and
> in the tests that pass.
> One observation: in the passing case the confirmed_flush LSN points
> exactly to the shutdown checkpoint record, but in the failing case it
> does not point to a valid record; pg_waldump reports that the address
> is invalid, skips 24 bytes, and then finds a valid record
>
> Passing case confirmed_flush LSN (0/150D158) from my machine:
> pg_waldump 000000010000000000000001 -s 0/150D158
> rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn:
> 0/0150D158, prev 0/0150D120, desc: CHECKPOINT_SHUTDOWN redo 0/150D158;
> tli 1; prev tli 1; fpw true; xid 0:739; oid 16388; multi 1; offset 0;
> oldest xid 728 in DB 1; oldest multi 1 in DB 1; oldest/newest commit
> timestamp xid: 0/0; oldest running xid 0; shutdown
>
> Failing case confirmed_flush LSN (0/1508000) from the failing test's log file:
> pg_waldump 000000010000000000000001 -s 0/1508000
> pg_waldump: first record is after 0/1508000, at 0/1508018, skipping
> over 24 bytes
> rmgr: XLOG len (rec/tot): 114/ 114, tx: 0, lsn:
> 0/01508018, prev 0/01507FC8, desc: CHECKPOINT_SHUTDOWN redo 0/1508018;
> tli 1; prev tli 1; fpw true; xid 0:739; oid 16388; multi 1; offset 0;
> oldest xid 728 in DB 1; oldest multi 1 in DB 1; oldest/newest commit
> timestamp xid: 0/0; oldest running xid 0; shutdown
>
> I'm still not sure why, in this case, the LSN does not point exactly
> to a valid WAL record and 24 bytes must be skipped to reach the
> shutdown checkpoint record.
> I will investigate this further and share the analysis.

On further analysis, we found that in the failing test the
CHECKPOINT_SHUTDOWN record started on a new WAL page, so a WAL page
header sat immediately before it - the 24 skipped bytes are exactly
that page header (SizeOfXLogShortPHD), which is why the slot's
confirmed_flush LSN did not match the record's LSN. We could also
reproduce the issue on demand by switching the WAL file before
restarting the server, as in the attached test change patch; a sketch
of that step follows.
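
For illustration, a minimal TAP-style sketch of the reproduction step
(assuming the test's publisher node variable is $node_publisher; the
attached patch may differ):

# Assumed sketch: switch to a new WAL segment just before restarting,
# so that the shutdown checkpoint record is written at the start of a
# fresh WAL page (preceded by a page header) rather than contiguously
# at the slot's confirmed_flush LSN.
$node_publisher->safe_psql('postgres', 'SELECT pg_switch_wal();');
$node_publisher->restart;
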
There are a couple of ways to fix this issue: a) switch the WAL before
inserting the records, so that CHECKPOINT_SHUTDOWN does not start on a
new page, as in the attached test_fix.patch; or b) use pg_walinspect
to check that the next WAL record at or after confirmed_flush is
CHECKPOINT_SHUTDOWN. I have yet to try approach b); a rough sketch is
below.
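
A hedged sketch of approach b), assuming pg_walinspect's
pg_get_wal_record_info() and its record_type output column, and
reusing the $node_publisher and $confirmed_flush_from_log names from
the existing test; the exact wiring is an assumption, not a final
patch:

$node_publisher->safe_psql('postgres', 'CREATE EXTENSION pg_walinspect;');

# pg_get_wal_record_info() reads the record located at or after the
# given LSN, so it should skip any page header at confirmed_flush.
my $record_type = $node_publisher->safe_psql('postgres',
	qq{SELECT record_type
	   FROM pg_get_wal_record_info('$confirmed_flush_from_log')});

is($record_type, 'CHECKPOINT_SHUTDOWN',
	'record at confirmed_flush is the shutdown checkpoint');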

Thanks to Bharath and Kuroda-san for the offline discussions and for
helping get to the root cause.

Thoughts?

Regards,
Vignesh

Attachment Content-Type Size
test_change_reproduce_issue.patch text/x-patch 576 bytes
test_fix.patch text/x-patch 1.1 KB
