Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)

From: Andres Freund <andres(at)anarazel(dot)de>
To: "Drouvot, Bertrand" <bertranddrouvot(dot)pg(at)gmail(dot)com>
Cc: Nitin Jadhav <nitinjadhavpostgres(at)gmail(dot)com>, Bharath Rupireddy <bharath(dot)rupireddyforpostgres(at)gmail(dot)com>, Matthias van de Meent <boekewurm+postgres(at)gmail(dot)com>, Ashutosh Sharma <ashu(dot)coek88(at)gmail(dot)com>, Julien Rouhaud <rjuju123(at)gmail(dot)com>, Bruce Momjian <bruce(at)momjian(dot)us>, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Magnus Hagander <magnus(at)hagander(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Report checkpoint progress with pg_stat_progress_checkpoint (was: Report checkpoint progress in server logs)
Date: 2022-11-15 20:18:11
Message-ID: 20221115201811.6tair4cyb3skayc4@awork3.anarazel.de
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

On 2022-11-04 09:25:52 +0100, Drouvot, Bertrand wrote:
>
> @@ -7023,29 +7048,63 @@ static void
> CheckPointGuts(XLogRecPtr checkPointRedo, int flags)
> {
> CheckPointRelationMap();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_REPLI_SLOTS);
> CheckPointReplicationSlots();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_SNAPSHOTS);
> CheckPointSnapBuild();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_LOGICAL_REWRITE_MAPPINGS);
> CheckPointLogicalRewriteHeap();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_REPLI_ORIGIN);
> CheckPointReplicationOrigin();
>
> /* Write out all dirty data in SLRUs and the main buffer pool */
> TRACE_POSTGRESQL_BUFFER_CHECKPOINT_START(flags);
> CheckpointStats.ckpt_write_t = GetCurrentTimestamp();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_CLOG_PAGES);
> CheckPointCLOG();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_COMMITTS_PAGES);
> CheckPointCommitTs();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_SUBTRANS_PAGES);
> CheckPointSUBTRANS();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_MULTIXACT_PAGES);
> CheckPointMultiXact();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_PREDICATE_LOCK_PAGES);
> CheckPointPredicate();
> +
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_BUFFERS);
> CheckPointBuffers(flags);
>
> /* Perform all queued up fsyncs */
> TRACE_POSTGRESQL_BUFFER_CHECKPOINT_SYNC_START();
> CheckpointStats.ckpt_sync_t = GetCurrentTimestamp();
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_SYNC_FILES);
> ProcessSyncRequests();
> CheckpointStats.ckpt_sync_end_t = GetCurrentTimestamp();
> TRACE_POSTGRESQL_BUFFER_CHECKPOINT_DONE();
>
> /* We deliberately delay 2PC checkpointing as long as possible */
> + pgstat_progress_update_param(PROGRESS_CHECKPOINT_PHASE,
> + PROGRESS_CHECKPOINT_PHASE_TWO_PHASE);
> CheckPointTwoPhase(checkPointRedo);
> }

This is quite the code bloat. Can we make this less duplicative?

> +CREATE VIEW pg_stat_progress_checkpoint AS
> + SELECT
> + S.pid AS pid,
> + CASE S.param1 WHEN 1 THEN 'checkpoint'
> + WHEN 2 THEN 'restartpoint'
> + END AS type,
> + ( CASE WHEN (S.param2 & 4) > 0 THEN 'immediate ' ELSE '' END ||
> + CASE WHEN (S.param2 & 8) > 0 THEN 'force ' ELSE '' END ||
> + CASE WHEN (S.param2 & 16) > 0 THEN 'flush-all ' ELSE '' END ||
> + CASE WHEN (S.param2 & 32) > 0 THEN 'wait ' ELSE '' END ||
> + CASE WHEN (S.param2 & 128) > 0 THEN 'wal ' ELSE '' END ||
> + CASE WHEN (S.param2 & 256) > 0 THEN 'time ' ELSE '' END
> + ) AS flags,
> + ( '0/0'::pg_lsn +
> + ((CASE
> + WHEN S.param3 < 0 THEN pow(2::numeric, 64::numeric)::numeric
> + ELSE 0::numeric
> + END) +
> + S.param3::numeric)
> + ) AS start_lsn,

I don't think we should embed this much complexity in the view
defintions. It's hard to read, bloats the catalog, we can't fix them once
released. This stuff seems like it should be in a helper function.

I don't have any iea what that pow stuff is supposed to be doing.

> + to_timestamp(946684800 + (S.param4::float8 / 1000000)) AS start_time,

I don't think this is a reasonable path - embedding way too much low-level
details about the timestamp format in the view definition. Why do we need to
do this?

Greetings,

Andres Freund

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2022-11-15 20:22:12 Re: Schema variables - new implementation for Postgres 15
Previous Message Jacob Champion 2022-11-15 20:08:40 Re: Moving forward with TDE