Re: pgsql: Validate page level checksums in base backups

From: Michael Banck <michael(dot)banck(at)credativ(dot)de>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>, Andres Freund <andres(at)anarazel(dot)de>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Validate page level checksums in base backups
Date: 2018-04-04 12:19:30
Message-ID: 20180404121930.GE20852@nighthawk.caipicrew.dd-dns.de
Lists: pgsql-committers pgsql-hackers

Hi,

On Wed, Apr 04, 2018 at 11:38:35AM +0200, Magnus Hagander wrote:
> On Tue, Apr 3, 2018 at 10:48 PM, Michael Banck <michael(dot)banck(at)credativ(dot)de>
> wrote:
>
> > Hi,
> >
> > On Tue, Apr 03, 2018 at 08:48:08PM +0200, Magnus Hagander wrote:
> > > On Tue, Apr 3, 2018 at 8:29 PM, Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> wrote:
> > > > I'd bet a good lunch that nondefault BLCKSZ would break it, as well,
> > > > since the way in which the corruption is induced is just guessing
> > > > as to where page boundaries are.
> > >
> > > Yeah, that might be a problem. Those should be calculated from the block
> > > size.
> > >
> > > > Also, scribbling on tables as sensitive as pg_class is just asking for
> > > > trouble IMO. I don't see anything in this test, for example, that
> > > > prevents autovacuum from running and causing a PANIC before the test
> > > > can complete. Even with AV off, there's a good chance that clobber-
> > > > cache-always animals will fall over because they do so many more
> > > > physical accesses to the system catalogs. I'd suggest inducing the
> > > > corruption in some user table(s) that we can more tightly constrain
> > > > the source server's accesses to.
> > >
> > > Yeah, that seems like a good idea. And probably also shut the server down
> > > while writing the corruption, just in case.
> > >
> > > Will stick looking into that on my todo for when I'm back, unless beaten to
> > > it. Michael, you want a stab at it?
> >
> > Attached is a patch which does that hopefully:
> >
> > 1. creates two user tables, one large enough for at least 6 blocks
> > (around 360 kB), the other just one block.
> >
> > 2. stops the cluster before scribbling over its data and starts it
> > afterwards.
> >
> > 3. uses the blocksize (and the page header size) to determine offsets
> > for scribbling.
> >
> > I've tested it with block sizes of 8 kB and 32 kB now; the latter should
> > make sure that the first table is indeed large enough, but maybe something
> > less arbitrary than "10000 integers" should be used?
> >
> > Anyway, sorry for the hassle.
> >
>
> Applied, with the addition that I explicitly disabled autovacuum on those
> tables as well.

Thanks! It looks like there have been no further buildfarm failures so far;
let's see how this goes.
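
For anyone following along, the offset calculation from item 3 quoted above
can be sketched roughly as follows. This is Python rather than the Perl of
the TAP test, and it is only an illustration, not the committed code: the
function names, the 24-byte page header constant and the number of bytes
zeroed are all assumptions made for the example.

```python
import os
import tempfile

PAGE_HEADER_SIZE = 24  # assumed size of the fixed page header, in bytes


def corruption_offsets(block_size, n_blocks, header_size=PAGE_HEADER_SIZE):
    """Return one file offset per block, just past each page header."""
    return [i * block_size + header_size for i in range(n_blocks)]


def scribble(path, block_size, n_blocks, junk=b"\0" * 9):
    """Overwrite a few bytes after the header of the first n_blocks pages.

    Intended to be run only while the cluster is shut down.
    """
    with open(path, "r+b") as f:
        for offset in corruption_offsets(block_size, n_blocks):
            f.seek(offset)
            f.write(junk)


# Demonstration on a scratch file standing in for a relation file.
block_size = 8192
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\xff" * (block_size * 3))
    scratch = tmp.name

scribble(scratch, block_size, n_blocks=3)
with open(scratch, "rb") as f:
    data = f.read()
os.remove(scratch)

print(corruption_offsets(block_size, 3))  # [24, 8216, 16408]
```

Because every offset is derived from the block size, the same code lands
just after the page header regardless of what BLCKSZ the server was built
with.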

> We might want to enhance it further by calculating the figure 10,000 based
> on blocksize perhaps?

10,000 was roughly twice what is needed for 32k block sizes. If there
are concerns that this might not be enough, I am happy to invest some
more time here (probably next week). However, the pg_basebackup
test suite takes up 800+ MB to run, so I don't see an urgent need to
optimize away 50-100 KB (which clearly everybody else thought as well)
if we are talking about disk space overhead.
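
As a back-of-the-envelope check of the "roughly twice" figure, here is a
small sketch. It assumes the usual heap layout numbers (24-byte page
header, 4-byte line pointer, 24-byte tuple header, 8-byte MAXALIGN), a
table with a single integer column, and pages filled completely; the
function names are made up for illustration.

```python
PAGE_HEADER = 24   # assumed fixed page header size, bytes
LINE_POINTER = 4   # assumed per-row item pointer size, bytes
TUPLE_HEADER = 24  # assumed heap tuple header size after padding, bytes
INT4 = 4           # one integer column


def maxalign(n, align=8):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align


def rows_per_page(block_size):
    """Estimate how many single-integer rows fit on one page."""
    per_row = maxalign(TUPLE_HEADER + INT4) + LINE_POINTER  # 32 + 4 = 36
    return (block_size - PAGE_HEADER) // per_row


def min_rows_for_blocks(block_size, n_blocks):
    """Smallest row count that spills into the n_blocks-th page."""
    return (n_blocks - 1) * rows_per_page(block_size) + 1


for kb in (8, 32):
    bs = kb * 1024
    print(kb, rows_per_page(bs), min_rows_for_blocks(bs, 6))
```

Under those assumptions a 32 kB block holds 909 such rows, so the sixth
block is reached at 4546 rows, and 10,000 is a bit more than twice that.
The same 36 bytes per row also give about 360 kB for 10,000 rows.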

Michael

--
Michael Banck
Project Manager / Senior Consultant
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael(dot)banck(at)credativ(dot)de

credativ GmbH, HRB Mönchengladbach 12080
VAT ID number: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Management: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
