Re: more descriptive message for process termination due to max_slot_wal_keep_size

From: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
To: bdrouvot(at)amazon(dot)com
Cc: sawada(dot)mshk(at)gmail(dot)com, ashutosh(dot)bapat(dot)oss(at)gmail(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: more descriptive message for process termination due to max_slot_wal_keep_size
Date: 2022-09-06 05:53:36
Message-ID: 20220906.145336.1494945638376688936.horikyota.ntt@gmail.com
Lists: pgsql-hackers

At Mon, 5 Sep 2022 11:56:33 +0200, "Drouvot, Bertrand" <bdrouvot(at)amazon(dot)com> wrote in
> Hi,
>
> On 3/2/22 7:37 AM, Kyotaro Horiguchi wrote:
> > At Tue, 04 Jan 2022 10:29:31 +0900 (JST), Kyotaro
> > Horiguchi<horikyota(dot)ntt(at)gmail(dot)com> wrote in
> >> So what do you say if I propose the following?
> >>
> >> LOG: terminating process %d to release replication slot \"%s\"
> >> because its restart_lsn %X/%X exceeds the limit %X/%X
> >> HINT: You might need to increase max_slot_wal_keep_size.
> > This version emits the following message.
> >
> > [35785:checkpointer] LOG: terminating process 36368 to release
> > replication slot "s1" because its restart_lsn 0/1F000148 exceeds the
> > limit 0/21000000
> > [35785:checkpointer] HINT: You might need to increase
> > max_slot_wal_keep_size.
>
> As the hint is to increase max_slot_wal_keep_size, what about
> reporting the difference in size (rather than the limit lsn)?
> Something along those lines?
>
> [35785:checkpointer] LOG: terminating process 36368 to release
> replication slot "s1" because its restart_lsn 0/1F000148 exceeds the
> limit by <NNN MB>.

Thanks! That might be more sensible, exactly for the reason you
mentioned. One issue with doing that is that size_pretty is a function
local to dbsize.c. Since the difference is less than 1 kB in many cases,
we cannot use a fixed unit for it.
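
To illustrate the point about units, here is a minimal sketch (not part of the patch set) of how the distance between two LSNs could be computed. It relies only on the fact that an LSN is printed as two hexadecimal 32-bit fields, high/low. The helper names parse_lsn and lsn_distance are hypothetical, and the second pair of LSNs is made up for the example.

```python
def parse_lsn(text: str) -> int:
    """Convert an LSN printed as 'hi/lo' (two hex fields) to a 64-bit byte position."""
    hi, lo = text.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lsn_distance(a: str, b: str) -> int:
    """Absolute distance in bytes between two LSNs (hypothetical helper)."""
    return abs(parse_lsn(a) - parse_lsn(b))


# LSNs taken from the sample message quoted above.
gap = lsn_distance("0/1F000148", "0/21000000")
print(gap)                     # 33554104 bytes
print(gap // (1024 * 1024))    # 31 when truncated to whole MB

# A made-up small gap: a fixed MB unit would report it as zero.
small = lsn_distance("0/1000000", "0/1000148")
print(small)                   # 328 bytes
print(small // (1024 * 1024))  # 0
```

A fixed MB unit hides the small gap entirely, which is why the unit has to adapt to the size.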

0001 and 0002 are the same as in v5.

0003 exposes byte_size_pretty() to other modules.
0004 makes the change using byte_size_pretty().

With 0004 applied, the messages look like this.

> LOG: terminating process 108413 to release replication slot "rep3" because its restart_lsn 0/7000D8 exceeds the limit by 1024 kB
> HINT: You might need to increase max_slot_wal_keep_size.

The reason for "1024 kB" instead of "1 MB" is that the precise value is
a bit less than 1024 * 1024.
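
The rounding effect can be reproduced with a simplified formatter. This is only a sketch under stated assumptions: pretty_size is a hypothetical stand-in, not the byte_size_pretty() from the patch nor the dbsize.c code, which may switch units at different thresholds. It picks the largest unit for which the value is at least 1 and rounds half up.

```python
UNITS = ["bytes", "kB", "MB", "GB", "TB"]


def pretty_size(nbytes: int) -> str:
    """Format a non-negative byte count with an adaptive unit.

    Simplified stand-in for illustration only; not the dbsize.c algorithm.
    """
    unit = 0
    scale = 1
    # Move to the next unit only once the value reaches one full unit of it.
    while unit < len(UNITS) - 1 and nbytes >= scale * 1024:
        scale *= 1024
        unit += 1
    # Integer round-half-up to the chosen unit.
    value = (nbytes + scale // 2) // scale
    return f"{value} {UNITS[unit]}"


print(pretty_size(1024 * 1024 - 1))  # 1024 kB: just under 1 MB, so it stays in kB
print(pretty_size(1024 * 1024))      # 1 MB
print(pretty_size(328))              # 328 bytes
```

In this sketch, a value one byte short of 1024 * 1024 is still expressed in kB and rounds up to 1024, matching the shape of the message above.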

regards.

-
Kyotaro Horiguchi
NTT Open Source Software Center

Attachment                                                       Content-Type  Size
v6-0001-Make-a-message-on-process-termination-more-dscrip.patch  text/x-patch  1.8 KB
v6-0002-Add-detailed-information-to-slot-invalidation-mes.patch  text/x-patch  2.9 KB
v6-0003-Expose-byte_size_pretty-in-dbsize.c.patch                text/x-patch  2.4 KB
v6-0004-Change-error-messages-to-show-the-difference-to-t.patch  text/x-patch  3.2 KB
