Re: Progress reporting for pg_verify_checksums

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Kyotaro HORIGUCHI <horiguchi(dot)kyotaro(at)lab(dot)ntt(dot)co(dot)jp>, alvherre(at)2ndquadrant(dot)com, mailings(at)oopsware(dot)de, thomas(dot)munro(at)enterprisedb(dot)com, pgsql-hackers(at)postgresql(dot)org
Subject: Re: Progress reporting for pg_verify_checksums
Date: 2019-03-28 07:26:43
Message-ID: 1553758003.4884.39.camel@credativ.de
Views: Raw Message | Whole Thread | Download mbox
Thread:
Lists: pgsql-hackers

Hi,

Am Mittwoch, den 27.03.2019, 21:31 +0100 schrieb Fabien COELHO:
> > > I still think that the speed should compute a double to avoid integer
> > > rounding errors within the computation. ISTM that rounding should only be
> > > done for display in the end.
> >
> > Ok, also done this way. I decided to print only full MB and not any
> > further digits as those don't seem very relevant.
>
> For the computation to be in double, you must convert to double somewhere
> in the formula, otherwise it is computed as ints and only cast as a double
> afterwards. Maybe:
>
> current_speed = (1000.0 / MEGABYTES) * current_size / elapsed;
>
> Or anything which converts to double early.

Are you sure, seeing elapsed is a double already? If I print out two
additional fractional digits for current_speed, I get two non-zero
numbers, are those garbage?

Anyway, I've done that now as it doesn't hurt.

> Instead of casting percent to int, maybe use "%.0f" as well, just like
> current_speed?

Ok.

> + if ((blockno % 100 == 0) && show_progress)
>
> I'd invert the condition to avoid a modulo when not needed.

Ok.

> > > I'm okay with calling the report on each file even if this means every few
> > > GB...
> >
> > For my I/O throttling branch I've backed off to only call the report
> > every 100 blocks (and at the end of the file). I've added that logic to
> > this patch as well.
>
> The 100 blocks = 800 KiB looks arbitrary. What is the rational? ISTM that
> postgres will tend to store a power of two number of pages on full
> segments, so maybe there could be a rational for 1024. Not sure.

It was arbitrary. I've changed it to 1024 now which looks a bit better.

> > > Someone suggested ETA, and it seems rather simple to do. What about
> > > adding it?
> >
> > I don't know, just computing the ETA is easy of course. But you'd have
> > to fiddle with seconds/minutes/hours conversions cause the runtime could
> > be hours. Then, you don't want to have it bumping around weirdly when
> > your I/O rate changes, cf. https://xkcd.com/612/ That doesn't sounds
> > simpler than the signal trigger we might get rid of.
>
> Ok, it is more complicated that it looks for large sizes if second is not
> the right display unit.

Right, new version attached.

Michael

--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer

Unser Umgang mit personenbezogenen Daten unterliegt
folgenden Bestimmungen: https://www.credativ.de/datenschutz

Attachment Content-Type Size
pg_verify_checksums_progress_V14.patch text/x-patch 8.6 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Sergei Kornilov 2019-03-28 07:45:08 Re: Planning counters in pg_stat_statements (using pgss_store)
Previous Message Haribabu Kommi 2019-03-28 06:12:57 Re: Transaction commits VS Transaction commits (with parallel) VS query mean time