Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: "Shinoda, Noriyoshi (PN Japan A&PS Delivery)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "amitlangote09(at)gmail(dot)com" <amitlangote09(at)gmail(dot)com>, "masahiko(dot)sawada(at)2ndquadrant(dot)com" <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Date: 2020-03-05 04:53:33
Message-ID: 5191ae78-4a53-8864-3f50-4f1977b741cd@oss.nttdata.com
Lists: pgsql-hackers

On 2020/03/05 9:31, Magnus Hagander wrote:
> On Mon, Mar 2, 2020 at 10:03 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>
>>
>>
>> On 2020/03/03 14:37, Shinoda, Noriyoshi (PN Japan A&PS Delivery) wrote:
>>> Hi,
>>>
>>> Thank you for developing good features.
>>> The attached patch is a small fix to the committed documentation. This patch fixes the description literal for the backup_streamed column.
>>
>> Thanks for the report and patch! Pushed.
>
> This patch requires, AIUI, that you add -P to the pg_basebackup
> commandline in order to get the progress tracked in details
> serverside.

Whether or not --progress is enabled, the pg_stat_progress_basebackup
view reports the progress of the backup on the server side. But the total
amount of data that will be streamed is estimated and reported only when
this option is enabled.
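
To illustrate what this means for someone monitoring the view, here is a
minimal Python sketch (not part of the patch). It assumes the monitoring
code has already read the backup_streamed and backup_total columns of
pg_stat_progress_basebackup, and that a missing estimate shows up as NULL
(None) or 0; the helper name percent_streamed is hypothetical.

```python
from typing import Optional


def percent_streamed(backup_streamed: int,
                     backup_total: Optional[int]) -> Optional[float]:
    """Return the percentage streamed, or None if no total was estimated."""
    # Assumption: without the PROGRESS option the total is unknown and is
    # seen here as None (SQL NULL) or 0.
    if not backup_total:
        return None
    # The total is only an estimate, so the streamed amount can exceed it
    # (for example if files grow during the backup); cap the result at 100.
    return min(100.0, 100.0 * backup_streamed / backup_total)
```

Without the estimate, a monitoring tool can still show how many bytes have
been streamed, but not how far along the backup is.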

> But this also generates output in the client that one
> might not want.
>
> Should we perhaps have a switch in pg_basebackup that enables the
> server side tracking only, without generating output in the client?

Yes, this sounds reasonable.

I have two ideas.

(1) Extend the --progress option so that it accepts values like
none, server, both (better names?). If server is specified, the PROGRESS
option is specified in the BASE_BACKUP replication command and
the total backup size is estimated on the server side, but the progress
is not reported on the client side. If both is specified, the progress is
also reported on the client side. If none, the PROGRESS option is not
specified in BASE_BACKUP. The drawback of this idea is that --progress
without an argument would no longer be supported, so existing
applications that pass --progress to pg_basebackup would need
to be updated when upgrading PostgreSQL to v13.

(2) Add something like an --estimate-backup-size (better name?) option
to pg_basebackup. If it's specified, the PROGRESS option is specified but
the progress is not reported on the client side.
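
The two ideas can be compared with a small Python model (a sketch only;
pg_basebackup itself is written in C and none of this exists yet). It
reads "server" as the value that sends PROGRESS without client output and
"both" as the value that also reports on the client. The names
ProgressPlan, plan_for_progress_value and plan_for_flags are hypothetical,
as are the option values and --estimate-backup-size.

```python
from typing import NamedTuple


class ProgressPlan(NamedTuple):
    send_progress_option: bool  # include PROGRESS in the BASE_BACKUP command
    report_on_client: bool      # print progress output in pg_basebackup


def plan_for_progress_value(value: str) -> ProgressPlan:
    """Idea (1): --progress takes one of none, server, both."""
    plans = {
        "none": ProgressPlan(False, False),
        "server": ProgressPlan(True, False),
        "both": ProgressPlan(True, True),
    }
    if value not in plans:
        raise ValueError(f"unknown --progress value: {value!r}")
    return plans[value]


def plan_for_flags(progress: bool, estimate_backup_size: bool) -> ProgressPlan:
    """Idea (2): keep --progress as a flag, add --estimate-backup-size."""
    return ProgressPlan(
        send_progress_option=progress or estimate_backup_size,
        report_on_client=progress,
    )
```

In this model both ideas can express the same three behaviours; the
difference is that idea (2) leaves a bare --progress working as it does
today.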

Thoughts?

Or, as another approach, it might be worth considering making
the server always estimate the total backup size whether or not --progress
is specified, as Amit argued upthread. If the time required to
estimate the backup size is negligible compared to the total backup time,
IMO this approach seems better. If we adopt this, we can also get
rid of the PROGRESS option from the BASE_BACKUP replication command.
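
One rough way to get a feel for the cost of such an estimate is to time a
pass over a directory tree that only sums file sizes. The Python sketch
below does that. It is an illustration under assumptions, not the server's
actual code: the real estimate is done in C inside the server and applies
its own rules about which files to count. The names estimate_total_size
and timed_estimate are hypothetical.

```python
import os
import time
from typing import Tuple


def estimate_total_size(root: str) -> int:
    """Sum the sizes of all regular files under root without reading them."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            try:
                total += os.lstat(os.path.join(dirpath, name)).st_size
            except FileNotFoundError:
                # A file may be removed while the tree is being scanned.
                continue
    return total


def timed_estimate(root: str) -> Tuple[int, float]:
    """Return (estimated bytes, seconds spent on the size-only pass)."""
    start = time.monotonic()
    total = estimate_total_size(root)
    return total, time.monotonic() - start
```

Comparing the elapsed time from timed_estimate with the duration of a full
backup of the same directory gives a first impression of whether the
estimation pass is negligible.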

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
