Re: Online checksums verification in the backend

From: Masahiko Sawada <masahiko(dot)sawada(at)2ndquadrant(dot)com>
To: Julien Rouhaud <rjuju123(at)gmail(dot)com>
Cc: Michael Paquier <michael(at)paquier(dot)xyz>, Robert Haas <robertmhaas(at)gmail(dot)com>, PostgreSQL Hackers <pgsql-hackers(at)lists(dot)postgresql(dot)org>, Masahiko Sawada <sawada(dot)mshk(at)gmail(dot)com>
Subject: Re: Online checksums verification in the backend
Date: 2020-04-05 04:13:30
Message-ID: CA+fd4k7c83ViCvZtochX9msnyigDjcNujphFSXdnd7QOaWXSqA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Sat, 4 Apr 2020 at 18:04, Julien Rouhaud <rjuju123(at)gmail(dot)com> wrote:
>
> On Fri, Apr 03, 2020 at 11:39:11AM +0200, Julien Rouhaud wrote:
> > On Fri, Apr 03, 2020 at 12:24:50PM +0900, Masahiko Sawada wrote:
> > >
> > > check_relation_fork() seems to quite depends on pg_check_relation()
> > > because the returned tuplestore is specified by pg_check_relation().
> > > It's just an idea but to improve reusability, how about moving
> > > check_relation_fork() to checksumfunc.c? That is, in checksumfuncs.c
> > > while iterating all blocks we call a new function in checksum.c, say
> > > check_one_block() function, which has the following part and is
> > > responsible for getting, checking the specified block and returning a
> > > boolean indicating whether the block has corruption or not, along with
> > > chk_found and chk_expected:
> > >
> > > /*
> > > * To avoid too much overhead, the buffer will be first read without
> > > * the locks that would guarantee the lack of false positive, as such
> > > * events should be quite rare.
> > > */
> > > Retry:
> > > if (!check_get_buffer(relation, forknum, blkno, buffer, force_lock,
> > > &found_in_sb))
> > > continue;
> > >
> > > if (check_buffer(buffer, blkno, &chk_expected, &chk_found))
> > > continue;
> > >
> > > /*
> > > * If we get a failure and the buffer wasn't found in shared buffers,
> > > * reread the buffer with suitable lock to avoid false positive. See
> > > * check_get_buffer for more details.
> > > */
> > > if (!found_in_sb && !force_lock)
> > > {
> > > force_lock = true;
> > > goto Retry;
> > > }
> > >
> > > A new function in checksumfuncs.c or pg_check_relation will be
> > > responsible for storing the result to the tuplestore. That way,
> > > check_one_block() will be useful for other use when we want to check
> > > if the particular block has corruption with low overhead.
> >
> >
> > Yes, I agree that passing the tuplestore isn't an ideal approach and some
> > refactoring should probably happen. One thing is that this wouldn't be
> > "check_one_block()" but "check_one_block_on_disk()" (which could also be from
> > the OS cache). I'm not sure how useful it's in itself. It also raises some
> > concerns about the throttling. I didn't change that for now, but I hope
> > there'll be some other feedback about it.
> >
>
>
> I had some time this morning, so I did the suggested refactoring as it seems
> like a way cleaner interface. I also kept the suggested check_one_block().

Thank you for updating the patch! The patch looks good to me. Here are
some random comments mostly about cosmetic changes.

1.
I think we can have two separate SQL functions:
pg_check_relation(regclass, text) and pg_check_relation(regclass),
instead of setting NULL by default to the second argument.

2.
+ * Check data sanity for a specific block in the given fork of the given
+ * relation, always retrieved locally with smgrred even if a version exists in

s/smgrred/smgrread/

3.
+ /* The buffer will have to check checked. */
+ Assert(checkit);

Should it be "The buffer will have to be checked"?

4.
+ if (!is_member_of_role(GetUserId(), DEFAULT_ROLE_READ_SERVER_FILES))
+ ereport(ERROR,
+ (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+ errmsg("only superuser or a member of the
pg_read_server_files role may use this function")));

Looking at the definition of pg_stat_read_server_files role, this role
seems to be for operations that could read non-database files such as
csv files. Therefore, currently this role is used by file_fdw and COPY
command. I personally think pg_stat_scan_tables would be more
appropriate for this function but I'm not sure.

5.
+ /* Set cost-based vacuum delay */
+ ChecksumCostActive = (ChecksumCostDelay > 0);
+ ChecksumCostBalance = 0;

s/vacuum/checksum verification/

6.
+ ereport(WARNING,
+ (errcode(ERRCODE_DATA_CORRUPTED),
+ errmsg("invalid page in block %u of relation %s",
+ blkno,
+ relpath(relation->rd_smgr->smgr_rnode, forknum))));

I think it's better to show the relation name instead of the relation path here.

7.
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("relation \"%s\" does not have storage to be checked",
+ quote_qualified_identifier(
+ get_namespace_name(get_rel_namespace(relid)),
+ get_rel_name(relid)))));

Looking at other similar error messages we don't show qualified
relation name but the relation name gotten by
RelationGetRelationName(relation). Can we do that here as well for
consistency?

8.
+ if (!(rsinfo->allowedModes & SFRM_Materialize))
+ ereport(ERROR,
+ (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not " \
+ "allowed in this context")));

I think it's better to have this error message in one line for easy grepping.

Regards,

--
Masahiko Sawada http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Noah Misch 2020-04-05 06:53:28 Re: where should I stick that backup?
Previous Message Peter Geoghegan 2020-04-05 04:02:56 Re: Reinitialize stack base after fork (for the benefit of rr)?