From: | Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> |
---|---|
To: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
Cc: | shveta malik <shveta(dot)malik(at)gmail(dot)com>, "Hayato Kuroda (Fujitsu)" <kuroda(dot)hayato(at)fujitsu(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date: | 2025-08-21 10:50:32 |
Message-ID: | CANhcyEVAaaGWYLF-LL+P_kE4AmVf3u9v19+mwRtag-w=hAjooA@mail.gmail.com |
Views: | Whole Thread | Raw Message | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Fri, 15 Aug 2025 at 04:38, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
>
> On Tue, Aug 12, 2025 at 1:26 AM Shlok Kyal <shlok(dot)kyal(dot)oss(at)gmail(dot)com> wrote:
> >
> >
> > Hi Sawada-san,
> >
> > I have reviewed the patch and have few comments:
>
> Thank you for reviewing the patch!
>
> >
> > 1. There are some spelling mistakes in logicaldecoding.sgml
> > + and requires waiting for any concurrent transactions to finish, ensureing
> > + system-wide conistency. Conversely, when the last logical replication slot
> > ensureing -> ensuring
> > conistency -> consistency
>
> Will fix.
>
> >
> > 2. In publicationcmds.c:
> > + errmsg("logical decoding needs to be enabled to
> > publish logical changes"),
> > + errhint("Set \"wal_level\" >= \"logical\" or create a
> > logical replication slot with \"replica\" WAL level before creating
> > subscriptions.")));
> >
> > Should we use something like this for errhint ?
> > errhint("Set \"wal_level\" >= \"logical\" or create a logical
> > replication slot before creating subscriptions when \"wal_level\" =
> > \"replica\".")));
>
> The current message is:
>
> Set "wal_level" >= "logical" or create a logical replication slot with
> "replica" WAL level before creating subscriptions.
>
> whereas the proposed message is:
>
> Set "wal_level" >= "logical" or create a logical replication slot
> before creating subscriptions when "wal_level" = "replica".
>
> I don't see a big difference between the two sentences but your point
> is to use 'when "wal_level" = "replica"' instead of 'with "replica"
> WAL level'?
>
Yes. After reading the error messages again, I also think there is no
big difference and I am fine with both.
> >
> >
> > 3. In logical.c:
> > + errmsg("logical decoding needs to be enabled on the primary"),
> > + errhint("Set \"wal_level\" >= \"logical\" or create
> > at least one logical slot with \"replica\" WAL level on the
> > primary.")));
> >
> > Should we change the errhint message as below?
> > errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > logical slot on the primary when \"wal_level\" = \"replica\".")));
> >
> > 4. In slotsync.c:
> > + errmsg("replication slot synchronization requires
> > logical decoding to be enabled"),
> > + errhint("Set \"wal_level\" >= \"logical\" or create at
> > least one logical slot with \"replica\" WAL level on the primary "));
> >
> > Should we change the errhint message as below?
> > errhint("Set \"wal_level\" >= \"logical\" or create at least one
> > logical slot on the primary when \"wal_level\" = \"replica\".")));
>
> Please see the above question.
>
> >
> > ---------
> >
> > I have also tested the patch with creating multiple permanent/
> > temporary slots in concurrent sessions and I did not find any issue. I
> > also tested this patch with a physical replication setup.
> > I have a doubt in this case:
> > 1. Suppose we have a physical replication setup. wal_level on primary
> > is set to 'replica'
> > 2. Now we try to create a logical slot on standby, an error is thrown
> > as wal_level is set to 'replica' as primary does not have any logical
> > slot
> > 3. Now we create a temporary logical slot on primary,
> > effective_wal_level is set to logical.
> > 4. Now we can create slots on standby as effective_wal_level is logical.
> > 5. Now we exit the session of the primary server. The temporary slot
> > is dropped. This will invalidate the slots on standby as the
> > effective_wal_level will be set to 'replica'.
> > So we can say that indirectly a temporary slot on primary can control
> > the behaviour of permanent slots on standbys.
> >
> > I checked this behaviour in HEAD. In HEAD also the behaviour is the
> > same. If we change the wal_level on primary from 'logical' to
> > 'replica', all slots are invalidated on the standby.
> > With patch this behaviour can be indirectly controlled by a temporary
> > slot. Is it fine? Thoughts?
>
> Your understanding is correct. I've discussed whether we need a way to
> keep auto-increased 'logical' WAL level on the primary when standbys
> have logical slots. You mentioned temporary logical slots cases but I
> think the same is true for the case where users accidentally drop the
> last logical slot.
>
> My understanding is that it's fine that logical decoding availability
> on standbys is controlled by primary's logical slots (including temp
> slots) presence. This essentially is the same behavior as the current
> one and users who are concerned about indirectly invalidating the
> logical slots on standbys can set wal_level to 'logical' on the
> primary. It's a separate discussion (and patch) whether we need to
> provide a way for users to keep auto-increased 'logical' WAL level on
> the primary.
>
Thanks for the explanation. I agree with you.
Also,
I did some testing with pg_createsubscriber and it is working fine. I
have some more comments:
1. in 040_pg_createsubscriber.pl we have:
# Check some unmet conditions on node P
$node_p->append_conf(
'postgresql.conf', q{
wal_level = replica
max_replication_slots = 1
max_wal_senders = 1
max_worker_processes = 2
});
Comment says: "Check some unmet conditions on node P". But with this
patch "wal_level = replica", will be a valid configuration, so it
will be contradictory to comment. Should we remove "wal_level =
replica" from the append_conf?
2. If we plan to change the above then we should also remove
"wal_level = logical" in the following:
# Restore default settings here but only apply it after testing standby. Some
# standby settings should not be a lower setting than on the primary.
$node_p->append_conf(
'postgresql.conf', q{
wal_level = logical
max_replication_slots = 10
max_wal_senders = 10
max_worker_processes = 8
});
Thanks,
Shlok Kyal
From | Date | Subject | |
---|---|---|---|
Next Message | Mircea Cadariu | 2025-08-21 11:08:37 | Re: Add os_page_num to pg_buffercache |
Previous Message | Amit Kapila | 2025-08-21 10:11:17 | Re: memory leak in logical WAL sender with pgoutput's cachectx |