Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>
Cc: pgsql-hackers(at)postgresql(dot)org
Subject: Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Date: 2020-01-30 17:29:38
Message-ID: 20969226-5617-9c60-68a6-06b3768ce03f@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
> At Wed, 29 Jan 2020 23:16:08 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>> Hi,
>>
>> pg_basebackup reports the backup progress if --progress option is
>> specified,
>> and we can monitor it in the client side. I think that it's useful if
>> we can
>> monitor the progress information also in the server side because, for
>> example,
>> we can easily check that by using SQL. So I'd like to propose
>> pg_stat_progress_basebackup view that allows us to monitor the
>> progress
>> of pg_basebackup, in the server side. Thought?
>>
>> POC patch is attached.
>
> | postgres=# \d pg_stat_progress_basebackup
> | View "pg_catalog.pg_stat_progress_basebackup"
> | Column | Type | Collation | Nullable | Default
> | ---------------------+---------+-----------+----------+---------
> | pid | integer | | |
> | phase | text | | |
> | backup_total | bigint | | |
> | backup_streamed | bigint | | |
> | tablespace_total | bigint | | |
> | tablespace_streamed | bigint | | |
>
> I think the view needs client identity such like host/port pair
> besides pid (host/port pair fails identify client in the case of
> unix-sockets.).

I don't think this is job of a progress reporting. If those information
is necessary, we can join this view with pg_stat_replication using
pid column as the join key.

> Also elapsed time from session start might be
> useful.

+1
I think that not only pg_stat_progress_basebackup but also all the other
progress views should report the time when the target command was
started and the time when the phase was last changed. Those times
would be useful to estimate the remaining execution time from the
progress infromation.

> pg_stat_progress_acuum has datid, datname and relid.
>
> + if (backup_total > 0 && backup_streamed > backup_total)
> + {
> + backup_total = backup_streamed;
>
> Do we need the condition "backup_total > 0"?

I added the condition for the case where --progress option is not supplied
in pg_basebackup, i.e., the case where the total amount of backup is not
estimated at the beginning of the backup. In this case, total_backup is
always 0.

> + int tblspc_streamed = 0;
> +
> + pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
> + PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP);
>
> This starts "streaming backup" phase with backup_total = 0. Coudln't
> we move to that phase after setting backup total and tablespace total?
> That is, just after calling SendBackupHeader().

OK, that's a bit less confusing for users. I will change in that way.

> + WHEN 3 THEN 'stopping backup'::text
>
> I'm not sure, but the "stop" seems suggesting the backup is terminated
> before completion. If it is following the name of the function
> pg_stop_backup, I think the name is suggesting to stop "the state for
> performing backup", not a backup.
>
> In the first place, the progress is about "backup" so it seems strange
> that we have another phase after the "stopping backup" phase. It
> might be better that it is "finishing file transfer" or such.
>
> "initializing"
> -> "starting file transfer"
> -> "transferring files"
> -> "finishing file transfer"
> -> "transaferring WAL"

Better name is always welcome! If "stopping back" is confusing,
what about "performing pg_stop_backup"? So

initializing
performing pg_start_backup
streaming database files
performing pg_stop_backup
transfering WAL files

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Robert Haas 2020-01-30 17:36:00 Re: Enabling B-Tree deduplication by default
Previous Message Alvaro Herrera 2020-01-30 17:13:51 Re: parens cleanup