Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side

From: Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com>
To: Amit Langote <amitlangote09(at)gmail(dot)com>
Cc: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Kyotaro Horiguchi <horikyota(dot)ntt(at)gmail(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_stat_progress_basebackup - progress reporting for pg_basebackup, in the server side
Date: 2020-02-17 13:01:30
Message-ID: c34db59b-384e-a2db-4f7b-7037875f0aac@oss.nttdata.com
Lists: pgsql-hackers

On 2020/02/06 11:35, Amit Langote wrote:
> On Wed, Feb 5, 2020 at 4:29 PM Amit Langote <amitlangote09(at)gmail(dot)com> wrote:
>> On Wed, Feb 5, 2020 at 3:36 PM Fujii Masao <masao(dot)fujii(at)oss(dot)nttdata(dot)com> wrote:
>>> Yeah, I understand your concern. The pg_basebackup document explains
>>> the risk when --progress is specified, as follows. Since I imagined that
>>> someone may explicitly disable --progress to avoid this risk, I made
>>> the server estimate the total size only when --progress is specified.
>>> But you think that this overhead by --progress is negligibly small?
>>>
>>> --------------------
>>> When this is enabled, the backup will start by enumerating the size of
>>> the entire database, and then go back and send the actual contents.
>>> This may make the backup take slightly longer, and in particular it will
>>> take longer before the first data is sent.
>>> --------------------
>>
>> Sorry, I hadn't read this before. So, my proposal would make this a lie.
>>
>> Still, if "streaming database files" is the longest phase, then not
>> having even an approximation of how much data is to be streamed over
>> doesn't much help estimating progress, at least as long as one only
>> has this view to look at.
>>
>> That said, the overhead of checking the size before sending any data
>> may be worse for some people than others, so having the option to
>> avoid that might be good after all.
>
> By the way, if calculating backup total size can take significantly
> long in some cases, that is when requested by specifying --progress,
> then it might be a good idea to define a separate phase for that, like
> "estimating backup size" or some such. Currently, it's part of
> "starting backup", which covers both running the checkpoint and size
> estimation which run one after another.

OK, I added this phase in the latest patch that I posted upthread.

Regards,

--
Fujii Masao
NTT DATA CORPORATION
Advanced Platform Technology Group
Research and Development Headquarters
