Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Date: 2020-02-03 04:17:17
Message-ID: 4f43097e-507e-1571-f1ee-a01c4a0d1db4@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/02/02 14:59, Masahiko Sawada wrote:
> On Fri, 31 Jan 2020 at 02:29, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>>
>> On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
>>> At Wed, 29 Jan 2020 23:16:08 +0900, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote in
>>>> Hi,
>>>>
>>>> pg_basebackup reports the backup progress if --progress option is
>>>> specified,
>>>> and we can monitor it in the client side. I think that it's useful if
>>>> we can
>>>> monitor the progress information also in the server side because, for
>>>> example,
>>>> we can easily check that by using SQL. So I'd like to propose
>>>> pg_stat_progress_basebackup view that allows us to monitor the
>>>> progress
>>>> of pg_basebackup, in the server side. Thought?
>>>>
>>>> POC patch is attached.
>>>
>>> | postgres=# \d pg_stat_progress_basebackup
>>> | View "pg_catalog.pg_stat_progress_basebackup"
>>> | Column | Type | Collation | Nullable | Default
>>> | ---------------------+---------+-----------+----------+---------
>>> | pid | integer | | |
>>> | phase | text | | |
>>> | backup_total | bigint | | |
>>> | backup_streamed | bigint | | |
>>> | tablespace_total | bigint | | |
>>> | tablespace_streamed | bigint | | |
>>>
>>> I think the view needs client identity such like host/port pair
>>> besides pid (host/port pair fails identify client in the case of
>>> unix-sockets.).
>>
>> I don't think this is job of a progress reporting. If those information
>> is necessary, we can join this view with pg_stat_replication using
>> pid column as the join key.
>>
>>> Also elapsed time from session start might be
>>> useful.
>>
>> +1
>> I think that not only pg_stat_progress_basebackup but also all the other
>> progress views should report the time when the target command was
>> started and the time when the phase was last changed. Those times
>> would be useful to estimate the remaining execution time from the
>> progress infromation.
>>
>>> pg_stat_progress_acuum has datid, datname and relid.
>>>
>>> + if (backup_total > 0 && backup_streamed > backup_total)
>>> + {
>>> + backup_total = backup_streamed;
>>>
>>> Do we need the condition "backup_total > 0"?
>>
>> I added the condition for the case where --progress option is not supplied
>> in pg_basebackup, i.e., the case where the total amount of backup is not
>> estimated at the beginning of the backup. In this case, total_backup is
>> always 0.
>>
>>> + int tblspc_streamed = 0;
>>> +
>>> + pgstat_progress_update_param(PROGRESS_BASEBACKUP_PHASE,
>>> + PROGRESS_BASEBACKUP_PHASE_STREAM_BACKUP);
>>>
>>> This starts "streaming backup" phase with backup_total = 0. Coudln't
>>> we move to that phase after setting backup total and tablespace total?
>>> That is, just after calling SendBackupHeader().
>>
>> OK, that's a bit less confusing for users. I will change in that way.

Fixed. Attached is the updated version of the patch.
I also fixed the regression test failure.

>>
>>> + WHEN 3 THEN 'stopping backup'::text
>>>
>>> I'm not sure, but the "stop" seems suggesting the backup is terminated
>>> before completion. If it is following the name of the function
>>> pg_stop_backup, I think the name is suggesting to stop "the state for
>>> performing backup", not a backup.
>>>
>>> In the first place, the progress is about "backup" so it seems strange
>>> that we have another phase after the "stopping backup" phase. It
>>> might be better that it is "finishing file transfer" or such.
>>>
>>> "initializing"
>>> -> "starting file transfer"
>>> -> "transferring files"
>>> -> "finishing file transfer"
>>> -> "transaferring WAL"
>>
>> Better name is always welcome! If "stopping back" is confusing,
>> what about "performing pg_stop_backup"? So
>>
>> initializing
>> performing pg_start_backup
>> streaming database files
>> performing pg_stop_backup
>> transfering WAL files
>
> Another idea I came up with is to show steps that take time instead of
> pg_start_backup/pg_stop_backup, for better understanding the
> situation. That is, "performing checkpoint" and "performing WAL
> archive" etc, which engage the most of the time of these functions.

Yeah, that's an idea. ISTM that "waiting for WAL archiving" sounds
better than "performing WAL archive". Thought?
I've not applied this change in the patch yet, but if there is no
other idea, I'd like to adopt this.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachment Content-Type Size
pg_stat_progress_basebackup_v2.patch text/plain 17.9 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-02-03 04:21:12 Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions
Previous Message Chris Travers 2020-02-03 02:37:01 Re: Internal key management system