Re: add checkpoint stats of snapshot and mapping files of pg_logical dir

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Nathan Bossart <nathandbossart(at)gmail(dot)com>, Andres Freund <andres(at)anarazel(dot)de>, Cary Huang <cary(dot)huang(at)highgo(dot)ca>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: add checkpoint stats of snapshot and mapping files of pg_logical dir
Date: 2022-07-19 11:59:10
Message-ID: CALj2ACWjjcmsdtDp7y2k6Q7gyq_2+9Ccrt_G=ZVEAuc7RE8evw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Tue, Apr 26, 2022 at 6:31 AM Michael Paquier <michael(at)paquier(dot)xyz> wrote:
>
> On Mon, Apr 25, 2022 at 01:34:38PM -0700, Nathan Bossart wrote:
> > I took another look at the example output, and I think I agree that logging
> > the total time for logical decoding operations is probably the best path
> > forward. This information would be enough to clue an administrator into
> > the possible causes of lengthy checkpoints, but it also wouldn't disrupt
> > the readability of the log statement too much.
>
> + /* translator: the placeholders after first %s show restartpoint/checkpoint options */
> + (errmsg("%s starting:%s%s%s%s%s%s%s%s",
> + restartpoint ?
> _("restartpoint") : _("checkpoint"),
>
> 0001 breaks translation, as "checkpoint/restartpoint" and "starting"
> would treated as separate terms to translate. That would not matter
> for English, but it does in French where we'd say "début du
> checkpoint". You could fix that by adding "starting" to each
> refactored term or build a string. 0002 does the latter, so my take
> is that you should begin using a StringInfo in 0001.

Thanks for reviewing. I've addressed the review comments, PSA v10
patch. Note that we can't use StringInfo as the checkpointer memory
context doesn't allow pallocs in the critical section and the
checkpoint can sometimes be run in the critical section.

I've also added the total number of WAL files a checkpoint has
processed (scanned the pg_wal directory) while removing old WAL files.
This helps to estimate the pg_wal disk space at the time of a
particular checkpoint, especially useful for debugging issues.

[1] sample output:
2022-07-19 10:33:45.378 UTC [3027866] LOG: checkpoint starting: wal
2022-07-19 10:33:51.434 UTC [3027866] LOG: checkpoint complete: wrote
50 buffers (0.3%); 0 WAL file(s) added, 12 removed, 35 recycled, 76
processed; write=3.651 s, sync=0.011 s, total=6.136 s; sync files=11,
longest=0.004 s, average=0.001 s; distance=770045 kB, estimate=770045
kB; lsn=0/95000260, redo lsn=0/79000068; logical decoding file(s)
processing=0.007 s

Regards,
Bharath Rupireddy.

Attachment Content-Type Size
v10-0001-Add-time-taken-for-processing-logical-decoding-f.patch application/x-patch 6.2 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Justin Pryzby 2022-07-19 12:09:48 errdetail/errhint style
Previous Message Amit Kapila 2022-07-19 11:56:53 Re: [BUG] Logical replication failure "ERROR: could not map filenode "base/13237/442428" to relation OID" with catalog modifying txns