Re: Online verification of checksums

From: Tomas Vondra <tomas(dot)vondra(at)2ndquadrant(dot)com>
To: Stephen Frost <sfrost(at)snowman(dot)net>, Michael Paquier <michael(at)paquier(dot)xyz>
Cc: Michael Banck <michael(dot)banck(at)credativ(dot)de>, Fabien COELHO <coelho(at)cri(dot)ensmp(dot)fr>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>
Subject: Re: Online verification of checksums
Date: 2018-09-29 15:49:55
Message-ID: 5477fe69-afb8-f759-2d45-680b187a2b81@2ndquadrant.com
Lists: pgsql-hackers

On 09/29/2018 02:14 PM, Stephen Frost wrote:
> Greetings,
>
> * Michael Paquier (michael(at)paquier(dot)xyz) wrote:
>> On Sat, Sep 29, 2018 at 10:51:23AM +0200, Tomas Vondra wrote:
>>> One more thought - when running similar tools on a live system, it's
>>> usually a good idea to limit the impact by throttling the throughput. As
>>> the verification runs in an independent process it can't reuse the
>>> vacuum-like cost limit directly, but perhaps it could do something
>>> similar? Like, limit the number of blocks read/second, or so?
>>
>> When it comes to such parameters, not using a number of blocks but
>> throttling with a value in bytes (kB or MB of course) speaks more to the
>> user. The past experience with checkpoint_segments is one example of
>> that. Converting that to a number of blocks internally would definitely
>> make the most sense. +1 for this idea.
>
> While I agree this would be a nice additional feature to have, it seems
> like something which could certainly be added later and doesn't
> necessarily have to be included in the initial patch. If Michael has
> time to add that, great, if not, I'd rather have this as-is than not.
>

True, although I don't think it'd be particularly difficult.

> I do tend to agree with Michael that having the parameter be specified
> as (or at least able to accept) a byte-based value is a good idea.

Sure, I was not really expecting it to be exposed as a raw block count. I
agree it should take byte-based values (i.e. just like --max-rate in
pg_basebackup).
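
To illustrate the idea (a byte-based rate from the user, converted to a
block budget internally), here is a minimal sketch in Python. It is not
taken from the patch: the names parse_rate, blocks_per_second and Throttle
are hypothetical, the k/M suffix handling is only an assumption loosely
modelled on --max-rate, and the block size is assumed to be the default
8192 bytes.

```python
import time

BLCKSZ = 8192  # assumed block size in bytes (the PostgreSQL default)


def parse_rate(text):
    """Parse a rate such as '512k' or '32M' into bytes per second.

    Hypothetical syntax: a bare number is treated as kB/s, 'k' means kB/s
    and 'M' means MB/s.
    """
    text = text.strip()
    units = {"k": 1024, "M": 1024 * 1024}
    if text and text[-1] in units:
        return int(text[:-1]) * units[text[-1]]
    return int(text) * 1024


def blocks_per_second(rate_bytes, block_size=BLCKSZ):
    """Convert a byte rate into a whole number of blocks per second."""
    return max(1, rate_bytes // block_size)


class Throttle:
    """Sleep as needed so that block reads do not exceed the given rate."""

    def __init__(self, blocks_per_sec, clock=time.monotonic, sleep=time.sleep):
        self.rate = blocks_per_sec
        self.clock = clock
        self.sleep = sleep
        self.start = clock()
        self.blocks = 0

    def block_read(self):
        """Account for one block read and sleep if we are ahead of schedule."""
        self.blocks += 1
        # Earliest time at which this many blocks may have been read.
        target = self.start + self.blocks / self.rate
        delay = target - self.clock()
        if delay > 0:
            self.sleep(delay)
        return max(delay, 0.0)
```

A verification loop would build Throttle(blocks_per_second(parse_rate("32M")))
once and call block_read() after each block it checks. Pacing against the
start time rather than sleeping a fixed amount per block lets the loop
catch up after a slow read instead of drifting below the requested rate.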

> As another feature idea, having this able to work in parallel across
> tablespaces would be nice too. I can certainly imagine some point where
> this is a default process which scans the database at a slow pace across
> all the tablespaces more-or-less all the time checking for corruption.
>

Maybe, but that's certainly a non-trivial feature.

regards

--
Tomas Vondra http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
