From: | Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> |
---|---|
To: | Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> |
Cc: | Bertrand Drouvot <bertranddrouvot(dot)pg(at)gmail(dot)com>, Ashutosh Bapat <ashutosh(dot)bapat(dot)oss(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org> |
Subject: | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |
Date: | 2025-05-10 07:35:15 |
Message-ID: | CAD21AoCYHqsKBw8mSTA6ZO5ODuaiHEUUnjk4-i18inLU702yEw@mail.gmail.com |
Views: | Raw Message | Whole Thread | Download mbox | Resend email |
Thread: | |
Lists: | pgsql-hackers |
On Sat, May 10, 2025 at 12:00 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Thu, May 8, 2025 at 1:06 AM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> >
> > On Tue, May 6, 2025 at 11:59 PM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > >
> > > On Thu, Apr 24, 2025 at 11:14 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > >
> > > > On Thu, Apr 24, 2025 at 5:30 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > >
> > > > > On Wed, Apr 23, 2025 at 9:35 PM Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com> wrote:
> > > > > >
> > > > > > On Wed, Apr 23, 2025 at 5:46 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
> > > > > > >
> > > > > > > BTW, did we consider the idea to automatically transition to 'logical'
> > > > > > > when the first logical slot is created and transition back to
> > > > > > > 'replica' when last logical slot gets dropped? I see some ideas around
> > > > > > > this last time we discussed this topic.
> > > > > >
> > > > > > Yes. Bertrand pointed out that a drawback is that the primary server
> > > > > > needs to create a logical slot in order to execute logical decoding on
> > > > > > the standbys[1].
> > > > > >
> > > > >
> > > > > True, but if we want to avoid that, we can still keep 'logical' as
> > > > > wal_level for the ease of users.
> > > >
> > > > I think we'd like to cover the use case like where users start with
> > > > 'replica' on the primary and execute logical decoding on the standby
> > > > without neither creating a logical slot on the primary nor restarting
> > > > the primary.
> > > >
> > >
> > > Okay, if we introduce a SIGHUP GUC like max_wal_level as you are
> > > proposing, the above requirement will be fulfilled, right?
> >
> > Right. Both the primary and the standby can increase WAL level to
> > 'logical' without server restart nor creating a logical slot.
> >
> > > The other
> > > way is by API pg_activate_logical_decoding().
> >
> > Yes. This approach would be simpler than the current proposal as we
> > don't need other new infrastructure such as executing a task in the
> > background.
> >
>
> Right, but to an extent, this is also similar to having a requirement
> of a logical slot on the primary. Now, it seems to me that the point
> you are trying to make is that to allow logical decoding on standby,
> it is okay to ask users to use pg_activate_logical_decoding() on
> primary, but it would be inconvenient to ask them to have a logical
> slot on primary instead. If my understanding is correct, then why do
> you think so? We recommend that users have a physical slot on primary
> and use it via primary_slot_name on standby to control resource
> removal, so why can't we ask them to have a logical slot on primary to
> allow logical decoding on standby?
I was thinking of a simple use case where users do logical decoding
from the physical standby. That is, the primary has a physical slot
and the standby uses it via primary_slot_name, and the subscriber
connects the standby server for logical replication with a logical
slot on the standby. In this case, IIUC we need to require users to
create a logical slot on the primary in order just to increase WAL
level to 'logical', but it doesn't make sense to me. No one is going
to use this logical slot and the primary ends up accumulating WALs.
> > However, we might want to note that wal_level value would
> > no longer show the actual runtime WAL level if the logical decoding is
> > activated via this API. Probably it's better to introduce a read-only
> > GUC, say runtime_wal_level, showing the actual WAL level.
> >
>
> Yeah, we need some way to show the correct value. In one of the
> previous emails on this thread, you mentioned that we can use
> show_hook to show the correct value. I see that show_in_hot_standby()
> uses in_memory value to show the correct value. Do you have something
> like that in your mind?
Yes.
> BTW, what is your idea to preserve the state to allow logical decoding
> across server restart when the user uses the API, do we want to
> persist the state in some way, if so, how? OTOH, if we use the idea to
> have a logical slot to allow decoding, then the presence of a logical
> slot can tell us whether we need to enable the new state to allow
> logical decoding after restart.
I vaguely thought of storing such information in the control file or a
checkpoint record along with wal_level value. But I've not seriously
considered how to implement this idea as I don't think it's not a good
user interface.
>
> > Also,
> > Ashutosh pointed out[1] before that cloud providers do not like
> > multiple ways of changing configuration esp. when they can not control
> > it. But I'm not sure this applies to the API as it's a SQL function
> > whose access privilege can be controlled.
> >
>
> By multiple ways, do we mean to say that one way for users would be to
> use the existing way (change wal_level to logical and restart server),
> and the other way would be to use the new API (or have a logical
> slot)?
Yes, that's my understanding of his comment.
> But won't similarly users have multiple ways to retain WAL for
> standby servers (either by using wal_keep_size or by having a
> primary_slot_name). The other example is that one can either manually
> change postgresql.conf file or use ALTER SYSTEM to change it, and then
> reloadthe config or restart the server for the change to take effect.
> There could be other similar examples as well if one tries to list all
> such possibilities.
True. I think allow_alter_system was introduced for users who don't
want to let their end users change the configuration via ALTER SYSTEM
command. Since the new API we're considering is an SQL function we
already have a way to control its access for such users.
> I feel one should be concerned if we are trying to
> make both wal_level GUC as SIGHUP, and also try to provide an API to
> enable logical decoding.
Agreed.
>
> > > >
> > >
> > > That makes sense. If we are using an API like
> > > pg_activate_*/pg_deactivate_*, then why add an additional dependency
> > > on the slots?
> >
> > I thought that we need to remember how logical decoding got enabled
> > because otherwise even if we enable logical decoding using the API,
> > it's disabled to 'replica' if all logical slots get removed. So the
> > idea I mentioned above is that we somehow prevent logical decoding
> > from being disabled even if all logical slots are removed. If we're
> > using only these APIs to enable/disable logical decoding, we don't
> > need to add a dependency on the slots, although we probably want to
> > disallow disabling logical decoding if there is at least one active
> > logical slot.
> >
>
> Yeah, this is a detail that should be discussed once we finalize the
> API to enable logical decoding on both primary and standby without
> restarting the primary server.
Agreed.
Another approach we might need to consider is to convert wal_level to
a SIGHUP parameter. While I mentioned that supporting all combinations
of wal_level value changes might make less sense for its complexity, I
think this is the most straightforward approach and interface. So it
might be worth trying to implement this approach to figure out the
actual complexity.
Regards,
--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
From | Date | Subject | |
---|---|---|---|
Next Message | Stepan Neretin | 2025-05-10 11:01:06 | Re: Proposal for enabling auto-vectorization for checksum calculations |
Previous Message | Amit Kapila | 2025-05-10 07:00:39 | Re: POC: enable logical decoding when wal_level = 'replica' without a server restart |