Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Date: 2020-02-03 14:04:49
Message-ID: 5b3419c6-31bf-1966-efe3-eaa7ec652754@oss.nttdata.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On 2020/02/03 16:28, Amit Langote wrote:
> On Mon, Feb 3, 2020 at 1:17 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>> On 2020/02/02 14:59, Masahiko Sawada wrote:
>>> On Fri, 31 Jan 2020 at 02:29, Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>>> On 2020/01/30 12:58, Kyotaro Horiguchi wrote:
>>>>> + WHEN 3 THEN 'stopping backup'::text
>>>>>
>>>>> I'm not sure, but the "stop" seems suggesting the backup is terminated
>>>>> before completion. If it is following the name of the function
>>>>> pg_stop_backup, I think the name is suggesting to stop "the state for
>>>>> performing backup", not a backup.
>>>>>
>>>>> In the first place, the progress is about "backup" so it seems strange
>>>>> that we have another phase after the "stopping backup" phase. It
>>>>> might be better that it is "finishing file transfer" or such.
>>>>>
>>>>> "initializing"
>>>>> -> "starting file transfer"
>>>>> -> "transferring files"
>>>>> -> "finishing file transfer"
>>>>> -> "transaferring WAL"
>>>>
>>>> Better name is always welcome! If "stopping back" is confusing,
>>>> what about "performing pg_stop_backup"? So
>>>>
>>>> initializing
>>>> performing pg_start_backup
>>>> streaming database files
>>>> performing pg_stop_backup
>>>> transfering WAL files
>>>
>>> Another idea I came up with is to show steps that take time instead of
>>> pg_start_backup/pg_stop_backup, for better understanding the
>>> situation. That is, "performing checkpoint" and "performing WAL
>>> archive" etc, which engage the most of the time of these functions.
>>
>> Yeah, that's an idea. ISTM that "waiting for WAL archiving" sounds
>> better than "performing WAL archive". Thought?
>> I've not applied this change in the patch yet, but if there is no
>> other idea, I'd like to adopt this.
>
> If we are trying to "pg_stop_backup" in phase name, maybe we should
> avoid "pg_start_backup"? Then maybe:
>
> initializing
> starting backup / waiting for [ backup start ] checkpoint to finish
> transferring database files
> finishing backup / waiting for WAL archiving to finish
> transferring WAL files

So we now have the following ideas about the phase names for pg_basebackup.

1.
initializing

2.
2-1. starting backup
2-2. starting file transfer
2-3. performing pg_start_backup
2-4. performing checkpoint
2-5. waiting for [ backup start ] checkpoint to finish

3.
3-1. streaming backup
3-2. transferring database files
3-3. streaming database files
3-4. transferring files

4.
4-1. stopping backup
4-2. finishing file transfer
4-3. performing pg_stop_backup
4-4. finishing backup
4-5. waiting for WAL archiving to finish
4-6. performing WAL archive

5.
1. transferring wal
2. transferring WAL files

What conbination of these do you prefer?
> Some comments on documentation changes in v2 patch:
>
> + Amount of data already streamed.

Ok, fixed.

> "already" may be redundant. For example, in pg_start_progress_vacuum,
> heap_blks_scanned is described as "...blocks scanned", not "...blocks
> already scanned".
>
> + <entry><structfield>tablespace_total</structfield></entry>
> + <entry><structfield>tablespace_streamed</structfield></entry>
>
> Better to use plural tablespaces_total and tablespaces_streamed for consistency?

Fixed.

> + The WAL sender process is currently performing
> + <function>pg_start_backup</function> and setting up for
> + making a base backup.
>
> How about "taking" instead of "making" in the above sentence?

Fixed. Attached is the updated version of the patch.

>
> - <varlistentry>
> + <varlistentry id="protocol-replication-base-backup" xreflabel="BASE_BACKUP">
>
> I don't see any new text in the documentation patch that uses above
> xref, so no need to define it?

The following description that I added uses this.

certain commands during command execution. Currently, the only commands
which support progress reporting are <command>ANALYZE</command>,
<command>CLUSTER</command>,
- <command>CREATE INDEX</command>, and <command>VACUUM</command>.
+ <command>CREATE INDEX</command>, <command>VACUUM</command>,
+ and <xref linkend="protocol-replication-base-backup"/> (i.e., replication
+ command that <xref linkend="app-pgbasebackup"/> issues to take
+ a base backup).

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters

Attachment Content-Type Size
pg_stat_progress_basebackup_v3.patch text/plain 17.8 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2020-02-03 14:07:09 Re: Complete data erasure
Previous Message Andres Freund 2020-02-03 13:49:18 Re: PATCH: standby crashed when replay block which truncated in standby but failed to truncate in master node