Re: Progress reporting for pg_verify_checksums

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Progress reporting for pg_verify_checksums
Date: 2018-09-19 15:17:05
Message-ID: alpine.DEB.2.21.1809191549340.8683@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


Hallo Michael,

>> This optionally prints the progress of pg_verify_checksums via read
>> kilobytes to the terminal with the new command line argument -P.
>>
>> If -P was forgotten and pg_verify_checksums operates on a large cluster,
>> the caller can send SIGUSR1 to pg_verify_checksums to turn progress
>> status reporting on during runtime.
>
> Version 2 of the patch is attached. This is rebased to master as of
> 422952ee and changes the signal handling to be a toggle as suggested by
> Alvaro.

Patch applies cleanly and compiles.

About tests: "make check" is okay, but ITSM that the command is not
started once, ever, in any test:-( It is unclear whether any test triggers
checksums anyway...

The time() granularity to the second makes the result awkward on small
tests:

8/1554552 kB (0%, 8 kB/s)
1040856/1554552 kB (66%, 1040856 kB/s)
1554552/1554552 kB (100%, 1554552 kB/s)

I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
much better precision.

The "kB" kilo-byte symbols stands for 1000 bytes, but you are displaying
1024-bytes kilos, which symbol is either "KiB" or "KB". Given that we are
about storage which is usually measured in powers of 1,000, so I'd suggest
to use thousands.

Another reserve I have is that on any hardware it is likely that there
will be a lot of kilos, so maybe megas (MB) should be used instead.

I'm wondering whether the bandwidth display should be local (from the last
display) or global (from the start of the command), but for the last
forced one. This is an open question.

Maybe it would be nice to show elapsed time and expected completion time
based on the current speed.

I would be in favor or having this turned on by default, and silenced with
some option. I'm not sure other people would agree, though, so it is an
open question as well.

About the code:

+ if (show_progress)
+ show_progress = false;
+ else
+ show_progress = true;

Why not a simple:

/* invert current state */
show_progress = !show_progress;

I do not see much advantage in using intermediate string buffers for
values:

+ snprintf(totalstr, sizeof(totalstr), INT64_FORMAT,
+ total_size / 1024);
+ snprintf(currentstr, sizeof(currentstr), INT64_FORMAT,
+ current_size / 1024);
+ snprintf(currspeedstr, sizeof(currspeedstr), INT64_FORMAT,
+ (current_size / 1024) / (((time(NULL) - scan_started) == 0) ? 1 : (time(NULL) - scan_started)));
+ fprintf(stderr, "%s/%s kB (%d%%, %s kB/s)",
+ currentstr, totalstr, total_percent, currspeedstr);

ISTM that just one simple fprintf would be fine:

fprintf(stderr, INT64_FORMAT "/" INT64_FORMAT " ...",
formulas for each slot...);

ISTM that this line overwriting trick deserves a comment:

+ if (isatty(fileno(stderr)))
+ fprintf(stderr, "\r");
+ else
+ fprintf(stderr, "\n");

And it runs a little amok with "-v".

+ memset(&act, '\0', sizeof(act));

pg uses 0 instead of '\0' everywhere else.

+ /* Make sure we just report at least once a second */
+ if ((now == last_progress_update) && !force) return;

The comments seems contradictory, I understand the code makes sure that it
is "at most" once a second. Pgbench uses "-P XXX" to strigger a progress
report every XXX second. I'm unsure whether it would be useful to allow
the user to change the 1 second display interval.

I'm not sure why you removed one unrelated line:

#include "storage/checksum.h"
#include "storage/checksum_impl.h"

-
static int64 files = 0;
static int64 blocks = 0;
static int64 badblocks = 0;

There is a problem in the scan_file code: The added sizeonly parameter is
not used. It should be removed.

--
Fabien.

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Joe Conway 2018-09-19 15:18:58 Re: doc - add missing documentation for "acldefault"
Previous Message Tom Lane 2018-09-19 14:54:44 Re: doc - add missing documentation for "acldefault"