Re: Add two missing tests in 035_standby_logical_decoding.pl

From: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add two missing tests in 035_standby_logical_decoding.pl
Date: 2023-05-09 06:02:09
Message-ID: CAA4eK1+3gbYgnbQypVyZYf6t44HEqcodp0OUC1HwqpHUGAYhvw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, May 8, 2023 at 1:45 PM Drouvot, Bertrand
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> On 5/8/23 4:42 AM, Amit Kapila wrote:
> > On Sat, May 6, 2023 at 9:33 PM Drouvot, Bertrand
> > <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >>
> >> On 5/6/23 3:28 PM, Amit Kapila wrote:
> >>> On Sat, May 6, 2023 at 1:52 PM Drouvot, Bertrand
> >>> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
> >>
> >>> Next steps:
> >>> =========
> >>> 1. The first thing is we should verify this theory by adding some LOG
> >>> in KeepLogSeg() to see if the _logSegNo is reduced due to the value
> >>> returned by XLogGetReplicationSlotMinimumLSN().
> >>
> >> Yeah, will do that early next week.
>
> It's done with the following changes:
>
> https://github.com/bdrouvot/postgres/commit/79e1bd9ab429a22f876b9364eb8a0da2dacaaef7#diff-c1cb3ab2a19606390c1a7ed00ffe5a45531702ca5faf999d401c548f8951c65bL7454-R7514
>
> With that in place, there is one failing test here: https://cirrus-ci.com/task/5173216310722560
>
> Where we can see:
>
> 2023-05-08 07:42:56.301 UTC [18038][checkpointer] LOCATION: UpdateMinRecoveryPoint, xlog.c:2500
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG: 00000: CreateRestartPoint: After XLByteToSeg RedoRecPtr is 0/4000098, _logSegNo is 4
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION: CreateRestartPoint, xlog.c:7271
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG: 00000: KeepLogSeg: segno changed to 4 due to XLByteToSeg
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION: KeepLogSeg, xlog.c:7473
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG: 00000: KeepLogSeg: segno changed to 3 due to XLogGetReplicationSlotMinimumLSN()
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION: KeepLogSeg, xlog.c:7483
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG: 00000: CreateRestartPoint: After KeepLogSeg RedoRecPtr is 0/4000098, endptr is 0/4000148, _logSegNo is 3
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION: CreateRestartPoint, xlog.c:7284
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG: 00000: BDT1 about to call RemoveOldXlogFiles in CreateRestartPoint
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOCATION: CreateRestartPoint, xlog.c:7313
> 2023-05-08 07:42:56.302 UTC [18038][checkpointer] LOG: 00000: attempting to remove WAL segments older than log file 000000000000000000000002
>
> So the suspicion about XLogGetReplicationSlotMinimumLSN() was correct (_logSegNo moved from
> 4 to 3 due to XLogGetReplicationSlotMinimumLSN()).
>
> >> What about postponing the physical slot creation on the standby and the
> >> cascading standby node initialization after this test?
> >>
> >
> > Yeah, that is also possible. But, I have a few questions regarding
> > that: (a) There doesn't seem to be a physical slot on cascading
> > standby, if I am missing something, can you please point me to the
> > relevant part of the test?
>
> That's right. There is a physical slot only on the primary and on the standby.
>
> What I meant up-thread is to also postpone the cascading standby node initialization
> after this test (once the physical slot on the standby is created).
>
> Please find attached a proposal doing so.
>
> > (b) Which test is currently dependent on
> > the physical slot on standby?
>
> Not a test but the cascading standby initialization with the "primary_slot_name" parameter.
>
> Also, now I think that's better to have the physical slot on the standby + hsf set to on on the
> cascading standby (coming from the standby backup).
>
> Idea is to avoid any risk of logical slot invalidation on the cascading standby in the
> standby promotion test.
>

Why not initialize the cascading standby node just before the standby
promotion test: "Test standby promotion and logical decoding behavior
after the standby gets promoted."? That way we will avoid any unknown
side-effects of cascading standby and it will anyway look more logical
to initialize it where the test needs it.

--
With Regards,
Amit Kapila.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Michael Paquier 2023-05-09 06:25:29 Re: WAL Insertion Lock Improvements
Previous Message Michael Paquier 2023-05-09 05:10:20 Re: WAL Insertion Lock Improvements