Re: Add two missing tests in 035_standby_logical_decoding.pl

From: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: vignesh C <vignesh21(at)gmail(dot)com>, "Yu Shi (Fujitsu)" <shiy(dot)fnst(at)fujitsu(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Add two missing tests in 035_standby_logical_decoding.pl
Date: 2023-05-05 10:32:33
Message-ID: 2c544b9f-419d-d18e-47be-f3d6f3d24e4e@gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 5/5/23 11:29 AM, Amit Kapila wrote:
> On Fri, May 5, 2023 at 1:16 PM Drouvot, Bertrand
> <bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>>
>>
>> After multiple attempts, I got one failing one.
>>
>> Issue is that we expect this file to be removed:
>>
>> [07:24:27.261](0.899s) #WAL file is /Users/admin/pgsql/build/testrun/recovery/035_standby_logical_decoding/data/t_035_standby_logical_decoding_standby_data/pgdata/pg_wal/000000010000000000000003
>>
>> But the standby emits:
>>
>> 2023-05-05 07:24:27.216 UTC [17909][client backend] [035_standby_logical_decoding.pl][3/6:0] LOG: statement: checkpoint;
>> 2023-05-05 07:24:27.216 UTC [17745][checkpointer] LOG: restartpoint starting: immediate wait
>> 2023-05-05 07:24:27.259 UTC [17745][checkpointer] LOG: attempting to remove WAL segments older than log file 000000000000000000000002
>>
>> So it seems the test is not right (missing activity??), not sure why yet.
>>
>
> Can you try to print the value returned by
> XLogGetReplicationSlotMinimumLSN() in KeepLogSeg() on standby? Also,
> please try to print "attempting to remove WAL segments ..." on the
> primary. We can see, if by any chance some slot is holding us to
> remove the required WAL file.
>

I turned DEBUG2 on. We can also see on the primary:

2023-05-05 08:23:30.843 UTC [16833][checkpointer] LOCATION: CheckPointReplicationSlots, slot.c:1576
2023-05-05 08:23:30.844 UTC [16833][checkpointer] DEBUG: 00000: snapshot of 0+0 running transaction ids (lsn 0/40000D0 oldest xid 746 latest complete 745 next xid 746)
2023-05-05 08:23:30.844 UTC [16833][checkpointer] LOCATION: LogCurrentRunningXacts, standby.c:1377
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOG: 00000: BDT1 about to call RemoveOldXlogFiles in CreateCheckPoint
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION: CreateCheckPoint, xlog.c:6835
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOG: 00000: attempting to remove WAL segments older than log file 000000000000000000000002
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION: RemoveOldXlogFiles, xlog.c:3560
2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG: 00000: recycled write-ahead log file "000000010000000000000001"
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION: RemoveXlogFile, xlog.c:3708
2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG: 00000: recycled write-ahead log file "000000010000000000000002"
2023-05-05 08:23:30.845 UTC [16833][checkpointer] LOCATION: RemoveXlogFile, xlog.c:3708
2023-05-05 08:23:30.845 UTC [16833][checkpointer] DEBUG: 00000: SlruScanDirectory invoking callback on pg_subtrans/0000

So, 000000010000000000000003 is not removed on the primary.

It has been recycled on:

2023-05-05 08:23:38.605 UTC [16833][checkpointer] DEBUG: 00000: recycled write-ahead log file "000000010000000000000003"

Which is later than the test:

[08:23:31.931](0.000s) not ok 19 - invalidated logical slots do not lead to retaining WAL

FWIW, the failing test with DEBUG2 can be found there: https://cirrus-ci.com/task/5615316688961536

Regards,

--
Bertrand Drouvot
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-05-05 10:34:18 Re: Tables getting stuck at 's' state during logical replication
Previous Message Peter Smith 2023-05-05 10:09:33 Re: Support logical replication of DDLs