Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

From: Magnus Hagander <magnus(at)hagander(dot)net>
To: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
Cc: Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, "Shinoda, Noriyoshi (PN Japan A&PS Delivery)" <noriyoshi(dot)shinoda(at)hpe(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, "amitlangote09(at)gmail(dot)com" <amitlangote09(at)gmail(dot)com>, "masahiko(dot)sawada(at)2ndquadrant(dot)com" <masahiko(dot)sawada(at)2ndquadrant(dot)com>, "pgsql-hackers(at)postgresql(dot)org" <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Date: 2020-03-06 17:54:09
Message-ID: CABUevEyi5Zrh06EFPTcrBFk+q=2ASYNymjMZ8qoNryL=gPbRpQ@mail.gmail.com
Lists: pgsql-hackers

On Fri, Mar 6, 2020 at 1:51 AM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>
>
>
> On 2020/03/06 0:45, Magnus Hagander wrote:
> > On Wed, Mar 4, 2020 at 11:15 PM Peter Eisentraut
> > <peter(dot)eisentraut(at)2ndquadrant(dot)com> wrote:
> >>
> >> On 2020-03-05 05:53, Fujii Masao wrote:
> >>> Or, as another approach, it might be worth considering to make
> >>> the server always estimate the total backup size whether --progress is
> >>> specified or not, as Amit argued upthread. If the time required to
> >>> estimate the backup size is negligible compared to total backup time,
> >>> IMO this approach seems better. If we adopt this, we can also get
> >>> rid of the PROGRESS option from the BASE_BACKUP replication command.
> >>
> >> I think that would be preferable.
> >
> > From a UI perspective I definitely agree.
> >
> > The problem with that one is that it can take a non-trivial amount of
> > time, which is why it was made an option (in the protocol) in the first
> > place. Particularly if you have a database with many small objects.
>
> Yeah, this is why I made the server estimate the total backup size
> only when --progress is specified.
>
> Another idea is:
> - Make pg_basebackup specify the PROGRESS option in the BASE_BACKUP command
> whether --progress is specified or not. This causes the server to estimate
> the total backup size even when users don't specify --progress.
> - Change pg_basebackup so that it treats the --progress option as just a knob to
> determine whether to report progress on the client side.
> - Add new option like --no-estimate-backup-size (better name?) to
> pg_basebackup. If this option is specified, pg_basebackup doesn't use
> PROGRESS in BASE_BACKUP and the server doesn't estimate the backup size.
>
> I believe that the time required to estimate the backup size is not so large
> in most cases, so with the above idea, most users don't need to specify an
> extra option for the estimation. This is good from a UI perspective.
>
> OTOH, users who are worried about the estimation time can use
> --no-estimate-backup-size option and skip the time-consuming estimation.

Personally, I think this is the best idea. It gives a "reasonable
default", since most people are not going to have this problem, and
still a good way out of the issue for those who potentially
have it. Especially since we already show the state
"walsender is estimating the size", it should be easy enough for people
to determine whether they need to use this flag or not.

In nitpicking mode, I'd just call the flag --no-estimate-size -- it's
pretty clear things are about backups when you call pg_basebackup, and
it keeps the option a bit more reasonable in length.
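
As a rough illustration of the option split discussed above, the client
logic might look something like the sketch below. This is not the actual
pg_basebackup code: the struct, its field names, the function name and the
fixed label are all hypothetical, and the command is simplified to just the
LABEL and PROGRESS parts.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical client-side settings; the names are illustrative only. */
typedef struct BackupOptions
{
	bool		show_progress;	/* --progress: print progress on the client */
	bool		estimate_size;	/* cleared by the proposed --no-estimate-size */
} BackupOptions;

/*
 * Build a simplified BASE_BACKUP command.
 *
 * PROGRESS is sent whenever size estimation is enabled, independently of
 * show_progress, which only controls what the client prints.
 *
 * Returns the number of characters written, or -1 if the buffer is too small.
 */
static int
build_base_backup_command(char *buf, size_t buflen, const BackupOptions *opts)
{
	int			n;

	assert(buf != NULL && opts != NULL);

	n = snprintf(buf, buflen, "BASE_BACKUP LABEL 'sketch'%s",
				 opts->estimate_size ? " PROGRESS" : "");

	return (n < 0 || (size_t) n >= buflen) ? -1 : n;
}
```

With estimate_size defaulting to true, a plain invocation would still send
PROGRESS, and only users who pass the new flag would skip the estimation.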

--
Magnus Hagander
Me: https://www.hagander.net/
Work: https://www.redpill-linpro.com/
