Re: Progress reporting for pg_verify_checksums

From: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
To: Michael Banck <michael(dot)banck(at)credativ(dot)de>
Cc: pgsql-hackers <pgsql-hackers(at)postgresql(dot)org>, Alvaro Herrera <alvherre(at)2ndquadrant(dot)com>
Subject: Re: Progress reporting for pg_verify_checksums
Date: 2018-09-29 07:35:35
Message-ID: alpine.DEB.2.21.1809290813200.10583@lancre
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers


>>>> The time() granularity to the second makes the result awkward on small
>>>> tests:
>>>>
>>>> 8/1554552 kB (0%, 8 kB/s)
>>>> 1040856/1554552 kB (66%, 1040856 kB/s)
>>>> 1554552/1554552 kB (100%, 1554552 kB/s)
>>>>
>>>> I'd suggest to reuse the "portability/instr_time.h" stuff which allows a
>>>> much better precision.
>>
>> I still think it would make sense to use that instead of low-precision
>> time().
>
> As the use-case of this is not for small tests,

Even for a long run, the induce error by the 1 second imprecision on both
points is significant at the beginning of the scan.

> I don't think it is useful to make the code more complicated for this.

I do not think that it would be much more complicated to use the
portability macros to get a precise time.

>>> The display is exactly the same as in pg_basebackup (except the
>>> bandwith is shown as well), so right now I think it is more useful to be
>>> consistent here.
>>
>> Hmmm... units are units, and the display is wrong. The fact that other
>> commands are wrong as well does not look like a good argument to persist in
>> an error.
>
> I've had a look around, and "kB" seems to be a common unit for 1024
> bytes, e.g. in pg_test_fsync etc., unless I am mistaken?

I can only repeat my above comment: the fact that postgres is wrong
elsewhere is not a good reason to persist in reproducing an error.

See the mother of all knowledge https://en.wikipedia.org/wiki/Kilobyte

- SI (decimal, 1000): kB, MB, GB, ...
- IEC (binary 1024): KiB, MiB, GiB, ...
- JEDEC (binary 1024, used for memory): KB, MB, GB.

Summary:
- 1 kB = 1000 bytes
- 1 KB = 1 KiB = 1024 bytes

Decimal is used for storage (HDD, SSD), binary for memory. That is life,
and IMHO Postgres code is not the place to invent new units.

Moreover, I still think that the progress should use MB defined as 1000000
bytes for showing the progress.

>> I would be okay with a progress printing function shared between some
>> commands. It just needs some place. pg_rewind also has a --rewind option.
>
> I guess you mean pg_rewind also has a --progress option.

Indeed.

> I do agree it makes sense to refactor that, but I don't think this
> should be part of this patch.

That's a point. I'd suggest to put the new progress function directly in
the common part, and other patches could take advantage of it for other
commands when someone feels like it.

Other comments:

function toggle_progress_report has empty lines after { and before },
but this style is not used elsewhere, I think that they should be removed.

The report_progress API should be ready to be used by another client
application, which suggest that global variables should be avoided.

Maybe:

void report_progress(current, total, force)

and the scan start and last times could be a static variable inside the
function initialized on the first call, which would suggest to call the
function at the beginning of the scan, probably with current == 0.

Or maybe:

time_type report_progress(current, total, start, last, force)

Which would return the last time, and the caller has responsability for
initializing a start time.

--
Fabien.

In response to

Browse pgsql-hackers by date

  From Date Subject
Next Message Matteo Beccati 2018-09-29 07:51:22 Re: [HACKERS] kqueue
Previous Message Pavel Stehule 2018-09-29 07:29:34 Re: Optional message to user when terminating/cancelling backend