Re: Progress reporting for pg_verify_checksums

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Progress reporting for pg_verify_checksums
Date: 2018-09-26 08:46:06
Message-ID: alpine.DEB.2.21.1809201058240.2239@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hallo Michael,

About patch v3: applies cleanly and compiles.

The xml documentation should be updated! (It is kind of hard to notice
what is not there:-)

See "doc/src/sgml/ref/pg_verify_checksums.sgml".

>> The time() granularity to the second makes the result awkward on small
>> tests:
>>
>> 8/1554552 kB (0%, 8 kB/s)
>> 1040856/1554552 kB (66%, 1040856 kB/s)
>> 1554552/1554552 kB (100%, 1554552 kB/s)
>>
>> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
>> much better precision.

I still think it would make sense to use that instead of low-precision
time().

>> The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
>> 1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
>> about storage which is usually measured in powers of 1,000, so I'd suggest
>> to use thousands.
>>
>> Another reserve I have is that on any hardware it is likely that there will
>> be a lot of kilos, so maybe megas (MB) should be used instead.
>
> The display is exactly the same as in pg_basebackup (except the
> bandwith is shown as well), so right now I think it is more useful to be
> consistent here.

Hmmm... units are units, and the display is wrong. The fact that other
commands are wrong as well does not look like a good argument to persist
in an error.

> So if we change that, pg_basebackup should be changed as well I think.

I'm okay with fixing pg_basebackup as well! I'm unsure about the best
place to put such a function, though. Probably under "src/common/" where
there is already some front-end specific code ("fe_memutils.c").

> Maybe the code could be factored out to some common file in the future.

I would be okay with a progress printing function shared between some
commands. It just needs some place. pg_rewind also has a --rewind option.

>> Maybe it would be nice to show elapsed time and expected completion time
>> based on the current speed.
>
> Maybe; this could be added to the factored out common code I mentioned
> above.

Yep.

>> I would be in favor or having this turned on by default, and silenced with
>> some option. I'm not sure other people would agree, though, so it is an open
>> question as well.
>
> If this runs in a script, it would be pretty annoying, so everybody
> would have to add --no-progress so I am not convinced. Also, both
> pg_basebackup and pgbench don't show progress by default.
>

Ok.

>> I do not see much advantage in using intermediate string buffers for values:
>>
>> + snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
>> + total_size / 1024);
>> + snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
>> + current_size / 1024);
>> + snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
>> + (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started)));
>> + fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
>> + currentstr, totalstr, total_percent, currspeedstr);
>>
>> ISTM that just one simple fprintf would be fine:
>>
>> fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
>> formulas for each slot...);
>
> That code is adopted from pg_basebackup.c and the comment there says:
>
> | * Separate step to keep platform-dependent format code out of
> | * translatable strings. And we only test for INT64_FORMAT
> | * availability in snprintf, not fprintf.
>
> So that sounds legit to me.

Hmmm. Translation. Not sure I like much the idea of translating units,
but why not.

> | * If we are reporting to a terminal, send a carriage return so that
> | * we stay on the same line. If not, send a newline.
>
>> And it runs a little amok with "-v".
>
> Do you suggest we should make those mutually exlusive? I agree that in
> general, -P is not very useful if -v is on, but if you have a really big
> table, it should still be, no?

Yep. I was just mentionning that they interfere on a terminal, but I agree
that there is no obvious fix.

>> + memset(&act, '\0', sizeof(act));
>>
>> pg uses 0 instead of '\0' everywhere else.
>
> Ok.

Not '0', plain 0 (the integer of value zero).

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Tomas Vondra 2018-09-26 09:10:45 Re: [PATCH] Improve geometric types
Previous Message Fabien COELHO 2018-09-26 08:26:41 Re: pgbench - add pseudo-random permutation function