Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>
Cc: Peter Smith <smithpb2250(at)gmail(dot)com>, Alvaro Herrera <alvherre(at)alvh(dot)no-ip(dot)org>, Greg Stark <stark(at)mit(dot)edu>, Euler Taveira <euler(at)eulerto(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: add log messages when replication slots become active and inactive (was Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?)
Date: 2023-11-13 12:13:13
Message-ID: CALj2ACXCCfi_UhFfvDtQbGqyjpf8X+3sBbz0_BsUp0rANuh8Eg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Nov 6, 2023 at 9:09 AM Amit Kapila <amit(dot)kapila16(at)gmail(dot)com> wrote:
>
> On Sun, Nov 5, 2023 at 4:01 AM Bharath Rupireddy
> <bharath(dot)rupireddyforpostgres(at)gmail(dot)com> wrote:
> >
> > On Thu, Nov 2, 2023 at 7:19 AM Peter Smith <smithpb2250(at)gmail(dot)com> wrote:
> > >
> > > But that's not quite compatible with what Alvaro [2] had written long
> > > back ("... the only acquisitions that would log messages are those in
> > > StartReplication and StartLogicalReplication.").
> > >
> > > In other words, ReplicationSlotAcquire/ReplicationSlotRelease is
> > > called by more places than you care to log from.
> >
> > I refreshed my thoughts for this patch and I think it's enough if
> > walsenders alone log messages when slots become active and inactive.
> > How about something like the attached v11 patch?
> >
>
> + * This function is currently used only in walsender.
> + */
> +void
> +ReplicationSlotAcquireAndLog(const char *name, bool nowait)
>
> BTW, is the reason for using it only in walsender is that it is a
> background process and it is not very apparent whether the slot is
> created, and for foreground processes, it is a bit clear when the
> command is executed.
>
> Can you please tell me the use case of this additional message?

Replication slot acquisitions and releases by backends say when
running pg_replication_slot_advance or pg_logical_slot_get_changes or
pg_drop_replication_slot or pg_create_{physical,
logical}_replication_slot are transient unlike walsenders which
comparatively hold slots for longer durations. Therefore, I've added
them only for walsenders. These messages help to know the lifetime of
a replication slot - one can know how long a streaming standby or
logical subscriber is down, IOW, how long a replication slot is
inactive in production. For instance, the time between released and
acquired slots in the below messages is the inactive replication slot
duration.

2023-11-13 11:06:34.338 UTC [470262] LOG: acquired physical
replication slot "sb_repl_slot"
2023-11-13 11:06:34.338 UTC [470262] STATEMENT: START_REPLICATION
SLOT "sb_repl_slot" 0/3000000 TIMELINE 1
2023-11-13 11:09:24.918 UTC [470262] LOG: released physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] LOG: acquired physical
replication slot "sb_repl_slot"
2023-11-13 12:01:40.530 UTC [470967] STATEMENT: START_REPLICATION
SLOT "sb_repl_slot" 0/3000000 TIMELINE 1

> If so, the other alternative is to either use a
> parameter to the existing function or directly use am_walsender flag
> to distinguish when to print the message in acquire/release calls.

Done that way. PSA v12.

> A few other minor comments:
> 1.
> + Causes each replication command and related activity to be logged in
> + the server log.
>
> Can we be bit more specific by changing to something like: "Causes
> each replication command and slot acquisition/release to be logged in
> the server log."

Done.

> 2.
> + ereport(log_replication_commands ? LOG : DEBUG1,
> + (errmsg("walsender process with PID %d acquired %s replication slot \"%s\"",
>
> It seems PID and process name is quite unlike what we print in other
> similar messages. For example, see below messages when we start
> replication via pub/sub :
>
> We can get the PID from the log line as for other logs and I don't see
> the process name printed anywhere else.

There was a comment upthread to have PID printed, but I agree to be
consistent and changed the messages to be: acquired physical/logical
replication slot "foo" and released physical/logical replication slot
"foo".

PSA v12 patch.

--
Bharath Rupireddy
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com

Attachment Content-Type Size
v12-0001-Emit-messages-when-replication-slots-become-acti.patch application/octet-stream 3.7 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2023-11-13 12:19:13 Re: pg_upgrade and logical replication
Previous Message Amit Kapila 2023-11-13 11:31:55 Re: pg_upgrade and logical replication