Re: WAL consistency check facility

From: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>
To: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
Cc: Robert Haas <robertmhaas(at)gmail(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: WAL consistency check facility
Date: 2016-11-10 15:02:58
Message-ID: CAGz5QCKWrfnSTZEPJEWB6XQo0mMViOuFAM6dyNjLMojC5M=NXg@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Thu, Nov 10, 2016 at 10:25 AM, Michael Paquier
<michael(dot)paquier(at)gmail(dot)com> wrote:
> On Wed, Nov 9, 2016 at 11:32 PM, Kuntal Ghosh
>> Thanks a lot for reviewing the patch. Based on your review, I've attached the

>> I've done a fair amount of testing which includes regression tests
>> and manual creation of inconsistencies in the page. I've also found the
>> reason behind inconsistency in brin revmap page.
>> Brin revmap page doesn't have standard page layout. Besides, It doesn't update
>> pd_upper and pd_lower in its operations as well. But, during WAL
>> insertions, it uses
>> REGBUF_STANDARD to register a reference in the WAL record. Hence, when we
>> restore image before consistency check, RestoreBlockImage fills the space
>> between pd_upper and pd_lower(page hole) with zero. I've posted this as a
>> separate thread.
>> https://www.postgresql.org/message-id/flat/CAGz5QCJ%3D00UQjScSEFbV%3D0qO5ShTZB9WWz_Fm7%2BWd83zPs9Geg%40mail(dot)gmail(dot)com#CAGz5QCJ=00UQjScSEFbV=0qO5ShTZB9WWz_Fm7+Wd83zPs9Geg(at)mail(dot)gmail(dot)com
>
> Nice to have spotted the inconsistency. This tool is going to be useful..
>
> I have spent a couple of hours playing with the patch, and worked on
> it a bit more with a couple of minor changes:
> - In gindesc.c, the if blocks are more readable with brackets.
> - Addition of a comment when info is set, to mention that this is done
> at the beginning of the routine so as callers of XLogInsert() can pass
> the flag for consistency checks.
> - apply_image should be reset in ResetDecoder().
> - The BRIN problem is here:
> 2016-11-10 12:24:10 JST [65776]: [23-1] db=,user=,app=,client= FATAL:
> Inconsistent page found, rel 1663/16385/30625, forknum 0, blkno 1
> 2016-11-10 12:24:10 JST [65776]: [24-1] db=,user=,app=,client=
> CONTEXT: xlog redo at 0/9BD31148 for BRIN/UPDATE+INIT: heapBlk 100
> pagesPerRange 1 old offnum 11, new offnum 1
> 2016-11-10 12:24:10 JST [65776]: [25-1] db=,user=,app=,client=
> WARNING: buffer refcount leak: [4540] (rel=base/16385/30625,
> blockNum=1, flags=0x93800000, refcount=1 1)
> TRAP: FailedAssertion("!(RefCountErrors == 0)", File: "bufmgr.c", Line: 2506)
> Now the buffer refcount leak is not normal! The safest thing to do
> here is to release the buffer once a copy of it has been taken, and
> the leaks goes away when calling FATAL to report the inconsistency.
> - When checking for XLR_CHECK_CONSISTENCY, better to add a != 0 to get
> a boolean out of it.
> - guc_malloc() should be done as late as possible, this simplifies the
> code and prevents any memory leaks which is what your patch is doing
> when there is an error. (I have finally put my finger on what was
> itching me here).
> - In btree_mask, the lookup of BTP_DELETED can be deadly simplified.
> - I wondered also about putting assign_wal_consistency and
> check_wal_consistency in a separate file, say xlogparams.c, concluding
> that the current code does nothing bad either even if it adds rmgr.h
> in the list of headers in guc.c.
> - Some comment blocks are longer than 72~80 characters.
> - Typos!
All the changes make perfect sense to me.

> With the patch for BRIN applied, I am able to get installcheck-world
> working with wal_consistency = all and a standby doing the consistency
> checks behind. Adding wal_consistency = all in PostgresNode.pm, the
> recovery tests are passing. This patch is switched as "Ready for
> Committer". Thanks for completing this effort begun 3 years ago!
Thanks to you for reviewing all the patches in so much detail. Amit, Robert
and Dilip also helped me a lot in developing the feature. Thanks to them
as well.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Stephen Frost 2016-11-10 15:28:12 Re: Fix checkpoint skip logic on idle systems by tracking LSN progress
Previous Message Mithun Cy 2016-11-10 14:59:45 Re: Patch: Implement failover on libpq connect level.