Report checkpoint progress in server logs

From: Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>
To: PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Report checkpoint progress in server logs
Date: 2021-12-29 14:30:54
Message-ID: CALj2ACV-F+K+z+XW8fnK4MV71qz2gzAMxFnYziRgZURMB5ycAQ@mail.gmail.com
Lists: pgsql-hackers

Hi,

At times, some checkpoint operations, such as removing old WAL
files or dealing with replication snapshot or mapping files, may
take a while, during which the server doesn't emit any logs or
other information; the only messages emitted are those from
LogCheckpointStart and LogCheckpointEnd. Most of the time this isn't
a problem because the checkpoint finishes quickly, but there can be
extreme situations in which users need to know what's going on with
the current checkpoint.

Given that commit 9ce346ea [1] introduced a nice mechanism to
report long-running operations of the startup process in the
server logs, I'm thinking we can have a similar progress mechanism
for checkpoints as well. Another idea, suggested in a couple of
other threads, is to have a pg_stat_progress_checkpoint view similar
to pg_stat_progress_analyze/vacuum/etc. The problem with that idea is
that during end-of-recovery or shutdown checkpoints the
pg_stat_progress_checkpoint view isn't accessible, as it requires a
connection to the server, which isn't allowed at those times.

Therefore, reporting the checkpoint progress in the server logs, much
like [1], seems to be the best way IMO. We can either:

1) make ereport_startup_progress and log_startup_progress_interval
more generic (something like ereport_log_progress and
log_progress_interval), move the code to elog.c, and use it for
checkpoint progress and, if required, for other time-consuming
operations; or

2) have an entirely different GUC and API for checkpoint progress.
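
To make option (1) a bit more concrete, here is a rough standalone
sketch of an interval-based progress reporter. It is not PostgreSQL
code and not a patch: ProgressState, begin_progress_phase,
progress_report_due and remove_old_wal_files are hypothetical names,
log_progress_interval_ms merely stands in for the suggested
log_progress_interval GUC, and the log message wording is invented.
It also polls a simulated clock to stay deterministic, which is an
assumption of the sketch rather than how the existing startup
progress code works.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for a generic log_progress_interval GUC (ms); 0 disables. */
long log_progress_interval_ms = 10000;

/* Hypothetical per-phase bookkeeping for progress reporting. */
typedef struct ProgressState
{
    long phase_start_ms;    /* when the current phase began */
    long next_report_ms;    /* earliest time the next message is due */
} ProgressState;

/* Start timing a new long-running phase. */
void
begin_progress_phase(ProgressState *st, long now_ms)
{
    assert(log_progress_interval_ms >= 0);
    st->phase_start_ms = now_ms;
    st->next_report_ms = now_ms + log_progress_interval_ms;
}

/*
 * Return true (and set *elapsed_ms) if a progress message is due.
 * Intervals that were missed entirely are skipped, so at most one
 * message is produced per call.
 */
bool
progress_report_due(ProgressState *st, long now_ms, long *elapsed_ms)
{
    long missed;

    if (log_progress_interval_ms == 0 || now_ms < st->next_report_ms)
        return false;

    *elapsed_ms = now_ms - st->phase_start_ms;
    missed = (now_ms - st->next_report_ms) / log_progress_interval_ms;
    st->next_report_ms += (missed + 1) * log_progress_interval_ms;
    return true;
}

/*
 * Simulated checkpoint phase: "removes" nfiles WAL files, each taking
 * ms_per_file on a fake clock. Returns the number of messages logged.
 */
int
remove_old_wal_files(int nfiles, long ms_per_file)
{
    ProgressState st;
    long now_ms = 0;
    long elapsed_ms;
    int nreports = 0;

    begin_progress_phase(&st, now_ms);
    for (int i = 0; i < nfiles; i++)
    {
        now_ms += ms_per_file;  /* pretend the removal took this long */
        if (progress_report_due(&st, now_ms, &elapsed_ms))
        {
            printf("LOG:  checkpoint: removing old WAL files, "
                   "elapsed time: %ld.%02ld s, files removed: %d\n",
                   elapsed_ms / 1000, (elapsed_ms % 1000) / 10, i + 1);
            nreports++;
        }
    }
    return nreports;
}
```

With the interval at 10 s, remove_old_wal_files(10, 2500) logs two
progress lines for 25 s of simulated work, and setting the interval
to 0 suppresses them entirely. The same helper could in principle be
shared by any other time-consuming operation, which is the appeal of
the generic approach.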

IMO, option (1), i.e. ereport_log_progress and log_progress_interval
(better names are welcome), seems like the better idea.

Thoughts?

[1]
commit 9ce346eabf350a130bba46be3f8c50ba28506969
Author: Robert Haas <rhaas(at)postgresql(dot)org>
Date: Mon Oct 25 11:51:57 2021 -0400

Report progress of startup operations that take a long time.

Regards,
Bharath Rupireddy.
