Re: more descriptive message for process termination due to max_slot_wal_keep_size

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: more descriptive message for process termination due to max_slot_wal_keep_size
Date: 2021-12-14 14:13:18
Message-ID: CALj2ACXmPefRz71OUpFtLF-kWd+bZHYAZ+cPPVSyJCBM1fcwtA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Dec 14, 2021 at 9:35 AM Kyotaro Horiguchi
<horikyota(dot)ntt(at)gmail(dot)com> wrote:
>
> Hello.
>
> As complained in pgsql-bugs [1], when a process is terminated due to
> max_slot_wal_keep_size, the related messages don't mention the root
> cause for *the termination*. Note that the third message does not
> show for temporary replication slots.
>
> [pid=a] LOG: "terminating process %d to release replication slot \"%s\""
> [pid=x] LOG: FATAL: terminating connection due to administrator command
> [pid=a] LOG: invalidting slot "s" because its restart_lsn X/X exceeds max_slot_wal_keep_size
>
> The attached patch attaches a DETAIL line to the first message.
>
> > [17605] LOG: terminating process 17614 to release replication slot "s1"
> + [17605] DETAIL: The slot's restart_lsn 0/2C0000A0 exceeds max_slot_wal_keep_size.
> > [17614] FATAL: terminating connection due to administrator command
> > [17605] LOG: invalidating slot "s1" because its restart_lsn 0/2C0000A0 exceeds max_slot_wal_keep_size
>
> Somewhat the second and fourth lines look inconsistent each other but
> that wouldn't be such a problem. I don't think we want to concatenate
> the two lines together as the result is a bit too long.
>
> > LOG: terminating process 17614 to release replication slot "s1" because it's restart_lsn 0/2C0000A0 exceeds max_slot_wal_keep_size.
>
> What do you think about this?
>
> [1] https://www.postgresql.org/message-id/20211214.101137.379073733372253470.horikyota.ntt%40gmail.com

+1 to give more context to the "terminating process %d to release
replication slot \"%s\"" message.

How about having below, instead of adding errdetail:
"terminating process %d to release replication slot \"%s\" whose
restart_lsn %X/%X exceeds max_slot_wal_keep_size"?

I think we can keep the "invalidating slot \"%s\" because its
restart_lsn %X/%X exceeds max_slot_wal_keep_size" message as-is. We
may not see "terminating process ..." and "invalidation slot ..."
messages together for the same slot, so having slightly different
wording is fine IMO.

Regards,
Bharath Rupireddy.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message John Naylor 2021-12-14 14:58:56 Re: cutting down the TODO list thread
Previous Message Robert Haas 2021-12-14 14:05:38 Re: Defining (and possibly skipping) useless VACUUM operations