Re: WAL consistency check facility

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: WAL consistency check facility
Date: 2017-01-31 05:35:13
Message-ID: CAB7nPqQYuwYnyT0HX+LE2UJi5_CD_RM-Uihx6ZBq=XtNcfq2iQ@mail.gmail.com
Lists: pgsql-hackers

On Thu, Jan 5, 2017 at 2:54 PM, Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
> On Wed, Dec 21, 2016 at 10:53 PM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
>
>> On a first read-through of this patch -- I have not studied it in
>> detail yet -- this looks pretty good to me. One concern is that this
>> patch adds a bit of code to XLogInsert(), which is a very hot piece of
>> code. Conceivably, that might produce a regression even when this is
>> disabled; if so, we'd probably need to make it a build-time option. I
>> hope that's not necessary, because I think it would be great to
>> compile this into the server by default, but we better make sure it's
>> not a problem. A bulk load into an existing table might be a good
>> test case.
>>
> I've done some bulk load testing with 16, 32, and 64 clients. I didn't
> notice any regression in the results.
>
>> Aside from that, I think the biggest issue here is that the masking
>> functions are virtually free of comments, whereas I think they should
>> have extensive and detailed comments. For example, in heap_mask, you
>> have this:
>>
>> + page_htup->t_infomask =
>> + HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID |
>> + HEAP_XMAX_COMMITTED | HEAP_XMAX_INVALID;
>>
>> For something like this, you could write "We want to ignore
>> differences in hint bits, since they can be set by SetHintBits without
>> emitting WAL. Force them all to be set so that we don't notice
>> discrepancies." Actually, though, I think that you could be a bit
>> more nuanced here: HEAP_XMIN_COMMITTED + HEAP_XMIN_INVALID =
>> HEAP_XMIN_FROZEN, so maybe what you should do is clear
>> HEAP_XMAX_COMMITTED and HEAP_XMAX_INVALID but only clear the others if
>> one is set but not both.
>>
> I've modified it as follows:
> +
> + if (!HeapTupleHeaderXminFrozen(page_htup))
> + page_htup->t_infomask |= HEAP_XACT_MASK;
> + else
> + page_htup->t_infomask |=
> + HEAP_XMAX_COMMITTED | HEAP_XMAX_INVALID;
>
>> Anyway, leaving that aside, I think every single change that gets
>> masked in every single masking routine needs a similar comment,
>> explaining why that change can happen on the master without also
>> happening on the standby and hopefully referring to the code that
>> makes that unlogged change.
>>
> I've added comments for all the masking routines.
>
>> I think wal_consistency_checking, as proposed by Peter, is better than
>> wal_consistency_check, as implemented.
>>
> Modified to wal_consistency_checking.
>
>> Having StartupXLOG() call pfree() on the masking buffers is a waste of
>> code. The process is going to exit anyway.
>>
>> + "Inconsistent page found, rel %u/%u/%u, forknum %u, blkno %u",
>>
> Done.
>
>> Primary error messages aren't capitalized.
>>
>> + if (!XLogRecGetBlockTag(record, block_id, &rnode, &forknum, &blkno))
>> + {
>> + /* Caller specified a bogus block_id. Do nothing. */
>> + continue;
>> + }
>>
>> Why would the caller do something so dastardly?
>>
> Modified to the following comment:
> + /*
> + * WAL record doesn't contain a block reference
> + * with the given id. Do nothing.
> + */
>
> I've attached the updated patch.

Moved to CF 2017-03 with same status, "ready for committer".
--
Michael
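
For readers following the hint-bit discussion quoted above, here is a minimal
standalone sketch of the masking logic from the modified patch hunk. It is an
illustration, not the patch itself: the flag values are copied in by hand (the
real definitions live in PostgreSQL's access/htup_details.h), and
xmin_frozen(), mask_infomask() and infomask_consistent() are hypothetical
helper names that stand in for HeapTupleHeaderXminFrozen() and the
surrounding heap_mask code.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Flag values copied here for illustration only; the authoritative
 * definitions are in PostgreSQL's access/htup_details.h.
 */
#define HEAP_XMIN_COMMITTED 0x0100
#define HEAP_XMIN_INVALID   0x0200
#define HEAP_XMIN_FROZEN    (HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID)
#define HEAP_XMAX_COMMITTED 0x0400
#define HEAP_XMAX_INVALID   0x0800
#define HEAP_XACT_MASK      0xFFF0  /* visibility-related bits */

/* Hypothetical stand-in for HeapTupleHeaderXminFrozen(). */
static bool
xmin_frozen(uint16_t infomask)
{
    return (infomask & HEAP_XMIN_FROZEN) == HEAP_XMIN_FROZEN;
}

/*
 * Hypothetical helper mirroring the quoted hunk.  Hint bits can be set
 * without emitting WAL, so force them on before comparing pages.  A
 * frozen xmin is left alone and only the xmax hint bits are forced.
 */
uint16_t
mask_infomask(uint16_t infomask)
{
    if (!xmin_frozen(infomask))
        infomask |= HEAP_XACT_MASK;
    else
        infomask |= HEAP_XMAX_COMMITTED | HEAP_XMAX_INVALID;
    return infomask;
}

/* Hypothetical helper: compare two infomasks after masking. */
bool
infomask_consistent(uint16_t primary, uint16_t standby)
{
    return mask_infomask(primary) == mask_infomask(standby);
}
```

Under these assumptions, infomask_consistent(0x0102, 0x0002) returns true,
because the only difference is an unlogged xmin hint bit, while a difference
in a bit outside HEAP_XACT_MASK, such as 0x0001 versus 0x0002, is still
reported as inconsistent.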
