Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

From: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Date: 2022-11-15 11:41:52
Message-ID: CAMm1aWaFXLV4U8Fzxs9wgFqiY7ZToozhrGKTxYPbW3ZSpW-psQ@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

> v6 was not applying anymore, due to a change in
> doc/src/sgml/ref/checkpoint.sgml done by b9eb0ff09e (Rename
> pg_checkpointer predefined role to pg_checkpoint).
>
> Please find attached a rebase in v7.
>
> While working on this rebase, I also noticed that "pg_checkpointer" is
> still mentioned in some translation files:

Thanks for rebasing the patch and sharing the information.
---

> That said, back to this patch: I did not look closely but noticed that
> the buffers_total reported by pg_stat_progress_checkpoint:
>
> postgres=# select type,flags,start_lsn,phase,buffers_total,new_requests
> from pg_stat_progress_checkpoint;
> type | flags | start_lsn | phase
> | buffers_total | new_requests
> ------------+-----------------------+------------+-----------------------+---------------+--------------
> checkpoint | immediate force wait | 1/E6C523A8 | checkpointing
> buffers | 1024275 | false
> (1 row)
>
> is a little bit different from what is logged once completed:
>
> 2022-11-04 08:18:50.806 UTC [3488442] LOG: checkpoint complete: wrote
> 1024278 buffers (97.7%);

This is because the count shown in the checkpoint complete message
includes the additional increment done during SlruInternalWritePage().
We are not sure of this increment until it really happens. Hence it
was not considered in the patch. To make it compatible with the
checkpoint complete message, we should increment all three here,
buffers_total, buffers_processed and buffers_written. So the total
number of buffers calculated earlier may not always be the same. If
this looks good, I will update this in the next patch.

Thanks & Regards,
Nitin Jadhav

On Fri, Nov 4, 2022 at 1:57 PM Drouvot, Bertrand
<bertranddrouvot(dot)pg(at)gmail(dot)com> wrote:
>
> Hi,
>
> On 7/28/22 11:38 AM, Nitin Jadhav wrote:
> >>> To understand the performance effects of the above, I have taken the
> >>> average of five checkpoints with the patch and without the patch in my
> >>> environment. Here are the results.
> >>> With patch: 269.65 s
> >>> Without patch: 269.60 s
> >>
> >> Those look like timed checkpoints - if the checkpoints are sleeping a
> >> part of the time, you're not going to see any potential overhead.
> >
> > Yes. The above data is collected from timed checkpoints.
> >
> > create table t1(a int);
> > insert into t1 select * from generate_series(1,10000000);
> >
> > I generated a lot of data by using the above queries which would in
> > turn trigger the checkpoint (wal).
> > ---
> >
> >> To see whether this has an effect you'd have to make sure there's a
> >> certain number of dirty buffers (e.g. by doing CREATE TABLE AS
> >> some_query) and then do a manual checkpoint and time how long that
> >> times.
> >
> > For this case I have generated data by using below queries.
> >
> > create table t1(a int);
> > insert into t1 select * from generate_series(1,8000000);
> >
> > This does not trigger the checkpoint automatically. I have issued the
> > CHECKPOINT manually and measured the performance by considering an
> > average of 5 checkpoints. Here are the details.
> >
> > With patch: 2.457 s
> > Without patch: 2.334 s
> >
> > Please share your thoughts.
> >
>
> v6 was not applying anymore, due to a change in
> doc/src/sgml/ref/checkpoint.sgml done by b9eb0ff09e (Rename
> pg_checkpointer predefined role to pg_checkpoint).
>
> Please find attached a rebase in v7.
>
> While working on this rebase, I also noticed that "pg_checkpointer" is
> still mentioned in some translation files:
> "
> $ git grep pg_checkpointer
> src/backend/po/de.po:msgid "must be superuser or have privileges of
> pg_checkpointer to do CHECKPOINT"
> src/backend/po/ja.po:msgid "must be superuser or have privileges of
> pg_checkpointer to do CHECKPOINT"
> src/backend/po/ja.po:msgstr
> "CHECKPOINTを実行するにはスーパーユーザーであるか、またはpg_checkpointerの権限を持つ必要があります"
> src/backend/po/sv.po:msgid "must be superuser or have privileges of
> pg_checkpointer to do CHECKPOINT"
> "
>
> I'm not familiar with how the translation files are handled (looks like
> they have their own set of commits, see 3c0bcdbc66 for example) but
> wanted to mention that "pg_checkpointer" is still mentioned (even if
> that may be expected as the last commit related to translation files
> (aka 3c0bcdbc66) is older than the one that renamed pg_checkpointer to
> pg_checkpoint (aka b9eb0ff09e)).
>
> That said, back to this patch: I did not look closely but noticed that
> the buffers_total reported by pg_stat_progress_checkpoint:
>
> postgres=# select type,flags,start_lsn,phase,buffers_total,new_requests
> from pg_stat_progress_checkpoint;
> type | flags | start_lsn | phase
> | buffers_total | new_requests
> ------------+-----------------------+------------+-----------------------+---------------+--------------
> checkpoint | immediate force wait | 1/E6C523A8 | checkpointing
> buffers | 1024275 | false
> (1 row)
>
> is a little bit different from what is logged once completed:
>
> 2022-11-04 08:18:50.806 UTC [3488442] LOG: checkpoint complete: wrote
> 1024278 buffers (97.7%);
>
> Regards,
>
> --
> Bertrand Drouvot
> PostgreSQL Contributors Team
> RDS Open Source Databases
> Amazon Web Services: https://aws.amazon.com

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2022-11-15 11:51:44 Re: Assertion failure in SnapBuildInitialSnapshot()
Previous Message Jelte Fennema 2022-11-15 11:38:00 Re: Add non-blocking version of PQcancel