Re: pg_basebackup --progress output for batch execution

From: Martín Marqués <martin(at)2ndquadrant(dot)com>
To: Jeff Janes <jeff(dot)janes(at)gmail(dot)com>
Cc: PostgreSQL mailing lists <pgsql-hackers(at)postgresql(dot)org>
Subject: Re: pg_basebackup --progress output for batch execution
Date: 2017-11-10 13:32:23
Message-ID: CAPdiE1w1Sz_8qrzBEKndK1HO=N1JWWNyuyS9nucVdbNO+pqwww@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

Hi,

Thanks for having a look at this patch.

2017-11-09 20:55 GMT-03:00 Jeff Janes <jeff(dot)janes(at)gmail(dot)com>:
> On Fri, Sep 29, 2017 at 4:00 PM, Martin Marques
> <martin(dot)marques(at)2ndquadrant(dot)com> wrote:
>>
>> Hi,
>>
>> Some time ago I had to work on a system where I was cloning a standby
>> using pg_basebackup, that didn't have screen or tmux. For that reason I
>> redirected the output to a file and ran it with nohup.
>>
>> I normally (always actually ;) ) run pg_basebackup with --progress and
>> --verbose so I can follow how much has been done. When done on a tty you
>> get a nice progress bar with the percentage that has been cloned.
>>
>> The problem came with the execution and redirection of the output, as
>> the --progress option will write a *very* long line!
>>
>> Back then I thought of writing a patch (actually someone suggested I do
>> so) to add a --batch-mode option which would change the behavior
>> pg_basebackup has when printing the output messages.
>
>
>
> While separate lines in the output file is better than one very long line,
> it still doesn't seem so useful. If you aren't watching it in real time,
> then you really need to have a timestamp on each line so that you can
> interpret the output. The lines are about one second apart, but I don't
> know robust that timing is under all conditions.

I kind of disagree with your view here.

If the cloning process takes many hours to complete (in my case, it
was around 12 hours IIRC) you might want to peak at the log every now
and then with tail.

I do agree on adding a timestamp prefix to each line, as it's not
clear from the code how often progress_report() is called.

> I think I agree with Arthur that I'd rather have the decision made by
> inspecting whether output is going to a tty, rather than by adding another
> command line option. But maybe that is not detected robustly enough across
> all platforms and circumstances?

In this case, Arthur's idea is good, but would make the patch less
generic/flexible for the end user.

That's why I tried to reproduce what top does when executed with -b
(Batch mode operation). There, it's the end user who decides how the
output is formatted (well, saying it decides on formatting a bit of an
overstatement, but you get it ;) )

An example where using isatty() might fail is if you run pg_basebackup
from a tty but redirect the output to a file, I believe that in that
case isatty() will return true, but it's very likely that the user
might want batch mode output.

But maybe we should also add Arthurs idea anyway (when not in batch
mode), as it seems pretty lame to output progress in one line if you
are not in a tty.

Thoughts?

--
Martín Marqués http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Pavel Stehule 2017-11-10 13:40:21 Re: Transform for pl/perl
Previous Message Etsuro Fujita 2017-11-10 11:36:01 Re: Another oddity in handling of WCO constraints in postgres_fdw