From: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, shveta malik <shveta(dot)malik(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date: | 2025-09-09 06:22:41 |
Message-ID: | CAA4eK1JXYcPrJ+pvzsDWHw6shyzWinL0yQNwythaqh75E4QkXA@mail.gmail.com |
Lists: | pgsql-hackers |
On Mon, Sep 8, 2025 at 11:22 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Fri, Sep 5, 2025 at 9:12 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> >
> > On Sat, Sep 6, 2025 at 3:58 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > >
> > > On Tue, Sep 2, 2025 at 5:12 AM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> > > >
> > > >
> > > > I tested the behaviour with HEAD and with the patch, and confirmed the
> > > > change in behaviour between HEAD and the patch.
> > > >
> > > > Suppose we have a primary and a standby with wal_level = logical and
> > > > guc parameters to enable slot sync worker are set accordingly. A slot
> > > > sync worker will be running.
> > > > Now we change the value of wal_level on the primary to replica and
> > > > restart the primary server.
> > > >
> > > > With HEAD, during restart the existing sync_slot_worker will exit with:
> > > > 2025-09-02 11:49:08.846 IST [3877882] ERROR: synchronization worker
> > > > "" could not connect to the primary server: connection to server at
> > > > "localhost" (127.0.0.1), port 5432 failed: Connection refused
> > > > Is the server running on that host and accepting TCP/IP connections?
> > > > 2025-09-02 11:49:11.380 IST [3877885] FATAL: streaming replication
> > > > receiver "walreceiver" could not connect to the primary server:
> > > > connection to server at "localhost" (127.0.0.1), port 5432 failed:
> > > > Connection refused
> > > > Is the server running on that host and accepting TCP/IP connections?
> > > >
> > > > and after the restart of the primary server, slot sync worker will
> > > > restart and it is able to connect to the primary.
> > > >
> > > > With Patch, during restart the existing sync_slot_worker will exit.
> > > > But after the restart of the primary server, slot sync worker cannot
> > > > start and we can see following log:
> > > > 2025-09-02 12:44:51.497 IST [3947520] LOG: replication slot
> > > > synchronization worker is shutting down on receiving SIGINT
> > > > 2025-09-02 12:44:51.498 IST [3943504] LOG: replication slot
> > > > synchronization requires logical decoding to be enabled
> > > > 2025-09-02 12:44:51.498 IST [3943504] HINT: To enable logical
> > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > one logical slot when "wal_level" = "replica".
> > > > 2025-09-02 12:45:51.537 IST [3943504] LOG: replication slot
> > > > synchronization requires logical decoding to be enabled
> > > > 2025-09-02 12:45:51.537 IST [3943504] HINT: To enable logical
> > > > decoding on primary, set "wal_level" >= "logical" or create at least
> > > > one logical slot when "wal_level" = "replica".
> > > >
> > > > So, with HEAD, after we restart the primary server with 'wal_level =
> > > > replica', the slot sync worker can restart and connect to the primary
> > > > but with patch it cannot start after restart due to the check in
> > > > ValidateSlotSyncParams.
> > >
> > > But the slotsync worker is launched again once logical decoding is
> > > enabled, no? I'm not sure that we want to launch the slotsync worker
> > > also when we know logical decoding is not enabled.
> > >
> >
> > Why did the logical decoding enabled check fail in the first place?
> > IIUC, the wal_level on the standby is still 'logical'.
>
> This is because logical decoding on standbys can be used only when the
> standby's effective_wal_level is 'logical', which also means the
> primary's effective_wal_level is 'logical' too. This behavior is
> mostly the same as today; logical decoding on standbys can be used
> only when both the primary and the standbys set wal_level to
> 'logical'. Even if standby's wal_level is set to logical, it doesn't
> mean that incoming WAL records are generated on the primary with the
> information required by logical decoding.
>
This is true, but IIUC Shlok's report says that we are able to restart
the server before the patch and not after the patch. Am I missing
something? If not, then shouldn't this be fixed separately first?
--
With Regards,
Amit Kapila.