RE: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>
To: "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, 'Masahiko Sawada' <sawada(dot)mshk(at)gmail(dot)com>
Cc: shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: RE: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-10-02 10:40:24
Message-ID: OSCPR01MB1496600E14B6AE808829D9B72F5E7A@OSCPR01MB14966.jpnprd01.prod.outlook.com
Lists: pgsql-hackers

I found that, with the fix I proposed in [1], effective_wal_level can remain
logical after promotion even after all logical slots have been dropped [2].
Steps:

0. Set up a streaming replication system in which both nodes had a replication slot.
1. Attached a debugger to the startup process and set a breakpoint at
UpdateLogicalDecodingStatusEndOfRecovery.
2. Sent a promotion request to the standby. The startup process stopped at the breakpoint.
3. Established a connection to the standby.
4. Attached to the backend and set a breakpoint at ReplicationSlotDrop.
5. Tried to drop the replication slot on the standby. The backend stopped at the breakpoint.
6. Advanced the startup process up to WaitForProcSignalBarrier(). Note that
allow_status_change was still off at this point.
7. Detached from the backend process.
8. Detached from the startup process.

This can happen because UpdateLogicalDecodingStatusEndOfRecovery() has already
decided to keep wal_level logical, and the subsequent DisableLogicalDecodingIfNecessary()
cannot disable it while allow_status_change is still off. allow_status_change
should be true in this case.
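
To make the interleaving in the steps above easier to follow, here is a small
single-threaded toy model in Python. It is only a sketch and not the patch's code:
the Ctl class, the n_slots counter, and the split of the end-of-recovery update
into a "decide" half and a "finish" half around the barrier wait are my
assumptions for illustration. Only the names xlog_logical_info and
allow_status_change are taken from the discussion.

```python
from dataclasses import dataclass


@dataclass
class Ctl:
    """Hypothetical stand-in for the shared logical decoding state."""
    xlog_logical_info: bool = True     # the standby currently runs with logical info
    allow_status_change: bool = False  # still off while recovery is finishing
    n_slots: int = 1                   # slots present on the standby (assumed counter)


def startup_decide(ctl: Ctl) -> bool:
    """First half of the end-of-recovery update: decide from the current slot count."""
    return ctl.n_slots > 0


def startup_finish(ctl: Ctl, keep_logical: bool) -> None:
    """Second half, after the barrier wait: publish the decision and allow changes."""
    ctl.xlog_logical_info = keep_logical
    ctl.allow_status_change = True


def disable_if_necessary(ctl: Ctl) -> None:
    """Model of the disable path: it does nothing while changes are not allowed."""
    if not ctl.allow_status_change:
        return
    if ctl.n_slots == 0:
        ctl.xlog_logical_info = False


def backend_drop_slot(ctl: Ctl) -> None:
    """Backend drops the slot, then tries to disable logical decoding."""
    ctl.n_slots -= 1
    disable_if_necessary(ctl)


def racy_order() -> Ctl:
    """Interleaving from the report: the slot is dropped during the barrier wait."""
    ctl = Ctl()
    keep = startup_decide(ctl)   # startup sees one slot and decides to keep logical
    backend_drop_slot(ctl)       # allow_status_change is off, so nothing is disabled
    startup_finish(ctl, keep)    # the stale decision is published
    return ctl


def serial_order() -> Ctl:
    """Startup finishes completely before the backend drops the slot."""
    ctl = Ctl()
    keep = startup_decide(ctl)
    startup_finish(ctl, keep)
    backend_drop_slot(ctl)
    return ctl
```

In this model racy_order() ends with no slots while xlog_logical_info is still
set, which corresponds to the state shown in [2], whereas serial_order() ends
with it cleared.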

I considered an approach that does not release the lock while waiting for the
ProcSignalBarrier, but then other processes could not read or update xlog_logical_info.

[1]: https://www.postgresql.org/message-id/OSCPR01MB14966B8F6F728F3FB4B05BFDBF5E7A%40OSCPR01MB14966.jpnprd01.prod.outlook.com
[2]:
```
postgres=# SELECT pg_is_in_recovery();
 pg_is_in_recovery
-------------------
 f
(1 row)

postgres=# SHOW effective_wal_level ;
 effective_wal_level
---------------------
 logical
(1 row)

postgres=# SELECT count(*) FROM pg_replication_slots ;
 count
-------
     0
(1 row)
```

Best regards,
Hayato Kuroda
FUJITSU LIMITED
