Inconsistency in reporting checkpointer stats

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: Pg Hackers <pgsql-hackers(at)postgresql(dot)org>
Subject: Inconsistency in reporting checkpointer stats
Date: 2022-12-14 07:31:49
Message-ID: CAMm1aWb18EpT0whJrjG+-nyhNouXET6ZUw0pNYYAe+NezpvsAA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

While working on checkpoint related stuff, I have encountered that
there is some inconsistency while reporting checkpointer stats. When a
checkpoint gets completed, a checkpoint complete message gets logged.
This message has a lot of information including the buffers written
(CheckpointStats.ckpt_bufs_written). This variable gets incremented in
2 cases. First is in BufferSync() and the second is in
SlruInternalWritePage(). On the other hand the checkpointer stats
exposed using pg_stat_bgwriter contains a lot of information including
buffers written (PendingCheckpointerStats.buf_written_checkpoints).
This variable gets incremented in only one place and that is in
BufferSync(). So there is inconsistent behaviour between these two
data. Please refer to the sample output below.

postgres=# select * from pg_stat_bgwriter;
checkpoints_timed | checkpoints_req | checkpoint_write_time |
checkpoint_sync_time | buffers_checkpoint | buffers_clean |
maxwritten_clean | buffers_backend | buffers_backend_fsync |
buffers_alloc | stats_reset
-------------------+-----------------+-----------------------+----------------------+--------------------+---------------+------------------+-----------------+-----------------------+---------------+-------------------------------
0 | 1 | 75 |
176 | 4702 | 0 | 0 |
4656 | 0 | 5023 | 2022-12-14
07:01:01.494672+00

2022-12-14 07:03:18.052 UTC [6087] LOG: checkpoint starting:
immediate force wait
2022-12-14 07:03:18.370 UTC [6087] LOG: checkpoint complete: wrote
4705 buffers (28.7%); 0 WAL file(s) added, 0 removed, 4 recycled;
write=0.075 s, sync=0.176 s, total=0.318 s; sync files=34,
longest=0.159 s, average=0.006 s; distance=66180 kB, estimate=66180
kB; lsn=0/5565E38, redo lsn=0/5565E00

In order to fix this, the
PendingCheckpointerStats.buf_written_checkpoints should be incremented
in SlruInternalWritePage() similar to
CheckpointStats.ckpt_bufs_written. I have attached the patch for the
same. Please share your thoughts.

Thanks & Regards,
Nitin Jadhav

Attachment Content-Type Size
v1-0001-Fix-inconsistency-in-checkpointer-stats.patch application/octet-stream 997 bytes

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2022-12-14 08:35:47 Re: generic plans and "initial" pruning
Previous Message Peter Eisentraut 2022-12-14 07:31:02 Common function for percent placeholder replacement