Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Magnus Hagander <magnus(at)hagander(dot)net>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>
Cc: "Shinoda, Noriyoshi (PN Japan A&PS Delivery)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "amitlangote09(at)gmail(dot)com" <amitlangote09(at)gmail(dot)com>, "masahiko(dot)sawada(at)2ndquadrant(dot)com" <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Date: 2020-03-06 09:51:55
Message-ID: 229c872f-d8c7-ee65-fb03-299850c7da93@oss.nttdata.com
Lists: pgsql-hackers

On 2020/03/06 0:45, Magnus Hagander wrote:
> On Wed, Mar 4, 2020 at 11:15 PM Peter Eisentraut
> <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
>>
>> On 2020-03-05 05:53, Fujii Masao wrote:
>>> Or, as another approach, it might be worth considering to make
>>> the server always estimate the total backup size whether --progress is
>>> specified or not, as Amit argued upthread. If the time required to
>>> estimate the backup size is negligible compared to total backup time,
>>> IMO this approach seems better. If we adopt this, we can also get
>>> rid of PROGRESS option from BASE_BACKUP replication command.
>>
>> I think that would be preferable.
>
> From a UI perspective I definitely agree.
>
> The problem with that one is that it can take a non-trivial amount of
> time, that's why it was made an option (in the protocol) in the first
> place. Particularly if you have a database with many small objects.

Yeah, this is why I made the server estimate the total backup size
only when --progress is specified.

Another idea is:
- Make pg_basebackup specify the PROGRESS option in the BASE_BACKUP command
whether or not --progress is specified. This causes the server to estimate
the total backup size even when users don't specify --progress.
- Change pg_basebackup so that it treats the --progress option as just a knob
that determines whether to report the progress on the client side.
- Add a new option like --no-estimate-backup-size (better name?) to
pg_basebackup. If this option is specified, pg_basebackup doesn't use
PROGRESS in BASE_BACKUP and the server doesn't estimate the backup size.
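
To make the three points above concrete, here is a rough sketch of the
option handling, written in Python only for brevity (pg_basebackup itself
is C). The names Options, build_base_backup_command and
client_reports_progress are made up for illustration, the option name
--no-estimate-backup-size is just the tentative one proposed above, and the
command string is simplified to LABEL plus PROGRESS.

```python
from dataclasses import dataclass


@dataclass
class Options:
    # --progress: under this proposal, only controls client-side reporting.
    progress: bool = False
    # Proposed --no-estimate-backup-size (tentative name): skip the estimate.
    no_estimate_backup_size: bool = False


def build_base_backup_command(opts, label="pg_basebackup base backup"):
    """Build a simplified BASE_BACKUP command string (illustrative only)."""
    parts = ["BASE_BACKUP", f"LABEL '{label}'"]
    # PROGRESS is sent by default so that the server estimates the total
    # backup size; it is omitted only when the user opts out explicitly.
    if not opts.no_estimate_backup_size:
        parts.append("PROGRESS")
    return " ".join(parts)


def client_reports_progress(opts):
    """--progress alone decides whether the client prints progress."""
    return opts.progress
```

With this shape, a plain pg_basebackup run still sends PROGRESS, so the
server-side view would get a total size without any extra option. What the
client should print when both --progress and --no-estimate-backup-size are
given is left open here.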

I believe that the time required to estimate the backup size is not so large
in most cases, so with the above idea, most users don't need to specify an
extra option to get the estimation. This is good from a UI perspective.

OTOH, users who are worried about the estimation time can use the
--no-estimate-backup-size option and skip the time-consuming estimation.

Thoughts?

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
