Re: POC: enable logical decoding when wal_level = 'replica' without a server restart

From: Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
To: shveta malik <shveta(dot)malik(at)gmail(dot)com>
Cc: Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: POC: enable logical decoding when wal_level = 'replica' without a server restart
Date: 2025-08-22 22:20:49
Message-ID: CAD21AoCKziyjPsg0y6W0N9w1Ta16NXv-0xW0KU1HTehthSg1nQ@mail.gmail.com
Views: Whole Thread | Raw Message | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Aug 21, 2025 at 8:11 PM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
>
> On Thu, Aug 21, 2025 at 10:34 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Wed, Aug 20, 2025 at 3:11 AM shveta malik <shveta(dot)malik(at)gmail(dot)com> wrote:
> > >
> > > Please find a few comments:
> >
> > Thank you for reviewing the patch!
> >
> > >
> > > 1)
> > > ReplicationSlotsDropDBSlots:
> > > + bool dropped = false;
> > >
> > > We can name 'dropped ' as 'dropped_logical' similar to ReplicationSlotCleanup.
> >
> > I think we don't necessarily need to add 'logical' because this
> > function attempts to drop only logical slots unlike
> > ReplicationSlotCleanup().
>
> Okay, I see. I missed that point earlier.
>
> >
> > >
> > > 2)
> > > ReplicationSlotsDropDBSlots()
> > > +
> > > + if (dropped && nlogicalslots == 0)
> > > + DisableLogicalDecodingIfNecessary();
> > >
> > > I could not understand the need of 'nlogicalslots' condition here?
> > > Once we increment 'nlogicalslots', there is no way we can skip the
> > > loop without dropping the slot with the only exception of ERROR-ing
> > > out if active_pid is non NULL. So if the loop has completed and we
> > > have reached this sage, won't it essentially mean 'nlogicalslots' is 0
> > > in both cases: a) we actually dropped any slot; b) we did not find
> > > any slot to drop. Or am I missing something?
> >
> > I think I should have incremented nlogicalslots even for logical slots
> > on other databases. What I want to do here is to call
> > DisableLogicalDecodingIfNecessary() only when we have dropped at least
> > one logical slots and there is no logical slots on the whole database
> > cluster as a result. If we have logical slots only on the current
> > database, we eventually reach the above 'if' statement with
> > dropped=true and nlogicalslots=0. On the other hand, if we have
> > logical slots also on other databases, we reach there with
> > dropped=true and nlogicalslots>0, meaning we don't want to disable
> > logical decoding. Does it make sense?
> >
>
> Yes, it makes sense after incrementing 'nlogicalslots' even for other databases.
>
> > >
> > > Same is the case with ReplicationSlotCleanup().
> > >
> > > 3)
> > > Few typos:
> > >
> > > + /*
> > > + * Update shmem flags. We don't need to care about the order of setting
> > > + * global flag and writing the WAL record this case since writes are not
> > > + * allowed yet.
> > > + */
> > >
> > > this case --> in this case
> > >
> > > + * This is the authoritative value used by the all process to determine
> > >
> > > 'used by all the processes'
> >
> > Fixed.
> >
> > > 049_effective_wal_level.pl:
> > > 4)
> > >
> > > Few typos:
> > > +# Initialize standby2 ndoe form the backup 'my_backup'.
> > >
> > > ndoe form --> node from
> > >
> > > +# Test the race condition between the startup and logical decoding
> > > statuc change.
> > >
> > > statuc --> status
> >
> > Fixed.
> >
> > >
> > > 5)
> > > +# Promote the standby2 node that has one logical slot. So the logical decoding
> > > +# keeps enabled even after the promotion.
> > > +$standby2->promote;
> > > +test_wal_level($standby2, "replica|logical",
> > > + "effective_wal_level keeps 'logical' even after the promotion");
> > > +$standby2->safe_psql('postgres',
> > > + qq[select pg_create_logical_replication_slot('standby2_slot2', 'pgoutput')]
> > > +);
> > > +$standby2->stop;
> > >
> > > Do we need 'pg_create_logical_replication_slot' here?
> >
> > Yes, I put it to check if we can create a logical slot even after the
> > promotion. I've added the comment to explain it.
> >
>
> Okay, makes sense.
>
> > >
> > > 6)
> > >
> > > +test_wal_level($primary, "replica|replica",
> > > + "effective_wal_level got decreased to 'replica' on primary");
> > > +test_wal_level($standby3, "logical|replica",
> > > + "effective_wal_level got decreased to 'replica' on standby");
> > > +test_wal_level($cascade, "replica|replica",
> > > + "effective_wal_level got decreased to 'logical' on standby");
> > > +
> > >
> > > Last one should also say: decreased to 'replica' (instead of logical)
> >
> > Fixed.
> >
> > I've attached the updated patch.

I found that we don't need to expose LogicalDecodingCtlData in
logicalctl.h header file. I've updated some cosmetic changes including
that point.

I think the patch is getting pretty good shape and am aiming at
getting this patch committed during the September commitfest. Is there
any further tests and verifications we need? Of course further patch
reviews are also welcome.

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v9-0001-Enable-logical-decoding-dynamically-based-on-logi.patch application/octet-stream 88.8 KB

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Corey Huinker 2025-08-22 23:15:35 Re: vacuumdb --missing-stats-only and permission issue
Previous Message Jeff Davis 2025-08-22 21:48:07 [18+] improve upgrade pre-check for Unicode updates