Re: WAL consistency check facility

From: Michael Paquier <michael(dot)paquier(at)gmail(dot)com>
To: Robert Haas <robertmhaas(at)gmail(dot)com>
Cc: Kuntal Ghosh <kuntalghosh(dot)2007(at)gmail(dot)com>, Peter Eisentraut <peter(dot)eisentraut(at)2ndquadrant(dot)com>, Amit Kapila <amit(dot)kapila16(at)gmail(dot)com>, Peter Geoghegan <pg(at)heroku(dot)com>, Simon Riggs <simon(at)2ndquadrant(dot)com>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, Amit Kapila <amit(dot)kapila(at)enterprisedb(dot)com>, Andres Freund <andres(at)anarazel(dot)de>
Subject: Re: WAL consistency check facility
Date: 2017-02-01 02:31:12
Message-ID: CAB7nPqQwBHEA46Shf00BGn0uJfSgg0BGZ36S7K-aDPVNm5b6Cw@mail.gmail.com
Views: Raw Message | Whole Thread | Download mbox | Resend email
Thread:
Lists: pgsql-hackers

On Wed, Feb 1, 2017 at 1:06 AM, Robert Haas <robertmhaas(at)gmail(dot)com> wrote:
> On Thu, Jan 5, 2017 at 12:54 AM, Kuntal Ghosh
> <kuntalghosh(dot)2007(at)gmail(dot)com> wrote:
>> I've attached the patch with the modified changes. PFA.
>
> Can this patch check contrib/bloom?

Only full pages are applied at redo by the generic WAL facility. So
you would finish by comparing a page with itself in generic_redo.

> + /*
> + * For a speculative tuple, the content of t_ctid is conflicting
> + * between the backup page and current page. Hence, we set it
> + * to the current block number and current offset.
> + */
>
> Why does it differ? Is that a bug?

This has been discussed twice in this thread, once by me, once by Alvaro:
https://www.postgresql.org/message-id/CAM3SWZQC8nUgp8SjKDY3d74VLpdf9puHc7-n3zf4xcr_bghPzg%40mail.gmail.com
https://www.postgresql.org/message-id/CAB7nPqQTLGvn_XePjS07kZMMw46kS6S7LfsTocK%2BgLpTN0bcZw%40mail.gmail.com

> + /*
> + * Leave if no masking functions defined, this is possible in the case
> + * resource managers generating just full page writes, comparing an
> + * image to itself has no meaning in those cases.
> + */
> + if (RmgrTable[rmid].rm_mask == NULL)
> + return;
>
> ...and also...
>
> + /*
> + * This feature is enabled only for the resource managers where
> + * a masking function is defined.
> + */
> + for (i = 0; i <= RM_MAX_ID; i++)
> + {
> + if (RmgrTable[i].rm_mask != NULL)
>
> Why do we assume that the feature is only enabled for RMs that have a
> mask function? Why not instead assume that if there's no masking
> function, no masking is required?

Not all RMs work on full pages. Tracking in smgr.h the list of RMs
that are no-ops when masking things is easier than having empty
routines declared all over the code base. Don't you think so>

> + void (*rm_mask) (char *page, BlockNumber blkno);
>
> Could the page be passed as type "Page" rather than a "char *" to make
> things more convenient for the masking functions? If not, could those
> functions at least do something like "Page page = (Page) pagebytes;"
> rather than "Page page_norm = (Page) page;"?

xlog_internal.h is used as well by frontends, this makes the header
dependencies cleaner. (I have looked at using Page when hacking this
stuff, but the header dependencies are not worth it, I don't recall
all the details though this was a couple of months back).
--
Michael

In response to

Responses

Browse pgsql-hackers by date

  From Date Subject
Next Message Kyotaro HORIGUCHI 2017-02-01 03:05:40 Re: Logical Replication and Character encoding
Previous Message Michael Paquier 2017-02-01 02:11:19 Re: Proposal : For Auto-Prewarm.