Re: WAL consistency check facility

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>
Subject: Re: WAL consistency check facility
Date: 2016-09-13 06:34:46
Message-ID: CAB7nPqRsGtZy7NBQr-B1pkwyaeh=bmtuFg6Sb5ixEZk0xfnwSA@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Mon, Sep 12, 2016 at 5:06 AM, Kuntal Ghosh
<kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
>>+ void (*rm_checkConsistency) (XLogReaderState *record);
>>All your _checkConsistency functions share the same pattern, in short
>>they all use a for loop for each block, call each time
>>XLogReadBufferExtended, etc. And this leads to a *lot* of duplication.
>>You would get a reduction by a couple of hundreds of lines by having a
>>smarter refactoring. And to be honest, if I look at your patch what I
>>think is the correct way of doing things is to add to the rmgr not
>>this check consistency function, but just a pointer to the masking
>>function.
>
> +1. In rmgrlist, I've added a pointer to the masking function for each rmid.
> A common function named checkConsistency calls these masking functions
> based on their rmid and does comparison for each block.

The patch size is down from 79kB to 38kB. That gets better :)

>>> - If WAL consistency check is enabled for a rmgrID, we always include
>>> the backup image in the WAL record.
>>
>>What happens if wal_consistency has different settings on a standby
>>and its master? If for example it is set to 'all' on the standby, and
>>'none' on the master, or vice-versa, how do things react? An update of
>>this parameter should be WAL-logged, no?
>
> If wal_consistency is enabled for a rmid, standby will always check whether
> backup image exists or not i.e. BKPBLOCK_HAS_IMAGE is set or not.
> (I guess Amit and Robert also suggested the same in the thread)
> Basically, BKPBLOCK_HAS_IMAGE is set if a block contains image and
> BKPIMAGE_IS_REQUIRED_FOR_REDO (I've added this one) is set if that backup
> image is required during redo. When we decode a wal record, has_image
> flag of DecodedBkpBlock is set to BKPIMAGE_IS_REQUIRED_FOR_REDO.

Ah I see. But do we actually store the status in the record itself,
then at replay we don't care of the value of wal_consistency at
replay. That's the same concept used by wal_compression. So shouldn't
you have more specific checks related to that in checkConsistency? You
actually don't need to check for anything in xlogreader.c, just check
for the consistency if there is a need to do so, or do nothing.

> For now, I've kept this as a WARNING message to detect all inconsistencies
> at once. Once, the patch is finalized, I'll modify it as an ERROR message.

Or say FATAL. This way the server is taken down.

> Thoughts?

A couple of extra thoughts:
1) The routines of each rmgr are located in a dedicated file, for
example GIN stuff is in ginxlog.c, etc. It seems to me that it would
be better to move each masking function where it should be instead
being centralized. A couple of routines need to be centralized, so I'd
suggest putting them in a new file, like xlogmask.c, xlog because now
this is part of WAL replay completely, including the lsn, the hint
bint and the other common routines.

2) Regarding page comparison:
+/*
+ * Compare the contents of two pages.
+ * If the two pages are exactly same, it returns BLCKSZ. Otherwise,
+ * it returns the location where the first mismatch has occurred.
+ */
+int
+comparePages(char *page1, char *page2)
We could just use memcpy() here. compareImages was useful to get a
clear image of what the inconsistencies were, but you don't do that
anymore.

3)
+static void checkConsistency(RmgrId rmid, XLogReaderState *record);
The RMGR if is part of the record decoded, so you could just remove
RmgrId from the list of arguments and simplify this interface.

4) If this patch still goes with the possibility to set up a list of
RMGRs, documentation is needed for that. I'd suggest writing first a
patch to explain what are RMGRs for WAL, then apply the WAL
consistency facility on top of it and link wal_consistency to it.

5)
+ has_image = record->blocks[block_id].has_image;
+ record->blocks[block_id].has_image = true;
+ if (!RestoreBlockImage(record, block_id, old_page))
+ elog(ERROR, "failed to restore block image");
+ record->blocks[block_id].has_image = has_image;
Er, what? And BKPIMAGE_IS_REQUIRED_FOR_REDO?

6)
+ /*
+ * Remember that, if WAL consistency check is enabled for
the current rmid,
+ * we always include backup image with the WAL record.
But, during redo we
+ * restore the backup block only if needs_backup is set.
+ */
+ if (needs_backup)
+ bimg.bimg_info |= BKPIMAGE_IS_REQUIRED_FOR_REDO;
This should use wal_consistency[rmid]?

7) This patch has zero documentation. Please add some. Any human being
on this list other than those who worked on the first versions
(Heikki, Simon and I?) is going to have a hard time to review this
patch in details moving on if there is no reference to tell what this
feature does for the user...

This patch is going to the good direction, but I don't think it's far
from being ready for commit yet. So I am going to mark it as returned
with feedback if there are no objections.
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Craig Ringer 2016-09-13 06:36:19 Re: 9.6 TAP tests and extensions
Previous Message Michael Paquier 2016-09-13 05:28:50 Re: OpenSSL 1.1 breaks configure and more