Re: Online checksums verification in the backend

From: Julien Rouhaud <rjuju123(at)gmail(dot)com>
To: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Online checksums verification in the backend
Date: 2020-04-04 09:04:28
Message-ID: 20200404090428.GD1206@nol
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Fri, Apr 03, 2020 at 11:39:11AM +0200, Julien Rouhaud wrote:
> On Fri, Apr 03, 2020 at 12:24:50PM +0900, Masahiko Sawada wrote:
> >
> > check_relation_fork() seems to quite depends on pg_check_relation()
> > because the returned tuplestore is specified by pg_check_relation().
> > It's just an idea but to improve reusability, how about moving
> > check_relation_fork() to checksumfunc.c? That is, in checksumfuncs.c
> > while iterating all blocks we call a new function in checksum.c, say
> > check_one_block() function, which has the following part and is
> > responsible for getting, checking the specified block and returning a
> > boolean indicating whether the block has corruption or not, along with
> > chk_found and chk_expected:
> >
> > /*
> > * To avoid too much overhead, the buffer will be first read without
> > * the locks that would guarantee the lack of false positive, as such
> > * events should be quite rare.
> > */
> > Retry:
> > if (!check_get_buffer(relation, forknum, blkno, buffer, force_lock,
> > &found_in_sb))
> > continue;
> >
> > if (check_buffer(buffer, blkno, &chk_expected, &chk_found))
> > continue;
> >
> > /*
> > * If we get a failure and the buffer wasn't found in shared buffers,
> > * reread the buffer with suitable lock to avoid false positive. See
> > * check_get_buffer for more details.
> > */
> > if (!found_in_sb && !force_lock)
> > {
> > force_lock = true;
> > goto Retry;
> > }
> >
> > A new function in checksumfuncs.c or pg_check_relation will be
> > responsible for storing the result to the tuplestore. That way,
> > check_one_block() will be useful for other use when we want to check
> > if the particular block has corruption with low overhead.
>
>
> Yes, I agree that passing the tuplestore isn't an ideal approach and some
> refactoring should probably happen. One thing is that this wouldn't be
> "check_one_block()" but "check_one_block_on_disk()" (which could also be from
> the OS cache). I'm not sure how useful it's in itself. It also raises some
> concerns about the throttling. I didn't change that for now, but I hope
> there'll be some other feedback about it.
>

I had some time this morning, so I did the suggested refactoring as it seems
like a way cleaner interface. I also kept the suggested check_one_block().

Attachment Content-Type Size
v7-0001-Add-a-pg_check_relation-function.patch text/plain 43.5 KB

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Amit Kapila 2020-04-04 09:09:32 Re: WAL usage calculation patch
Previous Message Amit Kapila 2020-04-04 08:57:56 Re: snapshot too old issues, first around wraparound and then more.