Re: Add two missing tests in 035_standby_logical_decoding.pl

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, vignesh C <vignesh21(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add two missing tests in 035_standby_logical_decoding.pl
Date: 2023-04-28 08:54:00
Message-ID: 054defef-de15-53e4-6c5a-d16ccc21d4ba@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 4/28/23 5:55 AM, Amit Kapila wrote:
> On Wed, Apr 26, 2023 at 7:53 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> +# Get the restart_lsn from an invalidated slot
> +my $restart_lsn = $node_standby->safe_psql('postgres',
> + "SELECT restart_lsn from pg_replication_slots WHERE slot_name =
> 'vacuum_full_activeslot' and conflicting is true;"
> +);
> +
> +chomp($restart_lsn);
> +
> +# Get the WAL file name associated to this lsn on the primary
> +my $walfile_name = $node_primary->safe_psql('postgres',
> + "SELECT pg_walfile_name('$restart_lsn')");
> +
> +chomp($walfile_name);
> +
> +# Check the WAL file is still on the primary
> +ok(-f $node_primary->data_dir . '/pg_wal/' . $walfile_name,
> + "WAL file still on the primary");
>
> How is it guaranteed that the WAL file corresponding to the
> invalidated slot on standby will still be present on primary?

The slot(s) have been invalidated by the "vacuum full" test just above
this one. So I think the WAL we are looking for is the last one being used
by the primary. As no activity happened on it since the vacuum full it looks to
me that it should still be present.

But I may have missed something and maybe that's not guarantee that this WAL is still there in all the cases.
In that case I think it's better to remove this test (it does not provide added value here).

Test removed in V7 attached.

> Can you
> please explain the logic behind this test a bit more like how the WAL
> file switch helps you to achieve the purpose?
>

The idea was to generate enough "wal switch" on the primary to ensure
the WAL file has been removed.

I gave another thought on it and I think we can skip the test that the WAL is
not on the primary any more. That way, one "wal switch" seems to be enough
to see it removed on the standby.

It's done in V7.

V7 is not doing "extra tests" than necessary and I think it's probably better like this.

I can see V7 failing on "Cirrus CI / macOS - Ventura - Meson" only (other machines are not complaining).

It does fail on "invalidated logical slots do not lead to retaining WAL", see https://cirrus-ci.com/task/4518083541336064

I'm not sure why it is failing, any idea?

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v7-0001-Add-a-test-to-verify-that-invalidated-logical-slo.patch text/plain 2.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-04-28 09:01:11 Re: Perform streaming logical transactions by background workers and parallel apply
Previous Message Daniel Gustafsson 2023-04-28 08:49:45 Re: Testing autovacuum wraparound (including failsafe)