Re: Online checksums verification in the backend

From: Michael Paquier <michael(at)paquier(dot)xyz>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Justin Pryzby <pryzby(at)telsasoft(dot)com>, Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Online checksums verification in the backend
Date: 2020-09-09 12:37:25
Message-ID: 20200909123725.GB2743@paquier.xyz
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Sep 09, 2020 at 11:25:24AM +0200, Julien Rouhaud wrote:
> I assumed that the odds of having to check the buffer twice were so low, and
> avoiding to keep a bufmapping lock while doing some IO was an uncontroversial
> enough optimisation, but maybe that's only wishful thinking.

Perhaps it is worth it, so it would be good to make sure of it and see
if that's actually worth the extra complication. This also depends if
the page is in the OS cache if the page is not in shared buffers,
meaning that smgrread() is needed to fetch the page to check. I would
be more curious to see if there is an actual difference if the page is
the OS cache.

> I'll do some becnhmarking and see if I can get some figures, but it'll probably
> take some time. In the meantime I'm attaching v11 of the patch that should
> address all other comments.

Thanks.

Another thing that was itching me is the introduction of 3 GUCs with
one new category for the sake of two SQL functions. For VACUUM we
have many things relying on the GUC delays, with autovacuum and manual
vacuum. Perhaps it would make sense to have these if we have some day
a background worker doing checksum verifications, still that could
perfectly be in contrib/, and that would be kind of hard to tune as
well. The patch enabling checksums on-the-fly could also be a reason
good enough. Another thing we could consider is to pass down those
parameters as function arguments, at the cost of not being able to
reload them.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Langote 2020-09-09 12:51:49 Re: [POC] Fast COPY FROM command for the table with foreign partitions
Previous Message Michael Paquier 2020-09-09 12:22:00 Re: Allow CLUSTER, VACUUM FULL and REINDEX to change tablespace on the fly