Re: pgsql: Validate page level checksums in base backups

From: Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us>
To: Magnus Hagander <magnus(at)hagander(dot)net>
Cc: Andres Freund <andres(at)anarazel(dot)de>, Michael Banck <michael(dot)banck(at)credativ(dot)de>, David Steele <david(at)pgmasters(dot)net>, PostgreSQL-development <pgsql-hackers(at)postgresql(dot)org>, pgsql-committers(at)lists(dot)postgresql(dot)org
Subject: Re: pgsql: Validate page level checksums in base backups
Date: 2018-04-03 18:29:11
Message-ID: 22322.1522780151@sss.pgh.pa.us
Lists: pgsql-committers pgsql-hackers

Magnus Hagander <magnus(at)hagander(dot)net> writes:
> Yeah, there's clearly a second problem here.

I think this test script is broken in many ways.

It's scribbling on the source cluster's disk files and assuming that that
translates one-for-one to what gets sent to the slave server --- but what
if some of the blocks that it modifies on-disk are resident in the
source's shared buffers? I think you'd have to shut down the source and
then apply the corruption if you want stable results.
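
As an illustration of that ordering only (not the test's actual code, which is a Perl TAP script), here is a minimal Python sketch. It assumes pg_ctl is on the PATH; the helper name corrupt_while_stopped and its parameters are hypothetical. The idea is that a clean stop flushes dirty buffers to disk, and after the restart shared buffers are empty, so the modified blocks have to be read back from the files.

```python
import subprocess
from typing import Callable, Sequence


def corrupt_while_stopped(
    data_dir: str,
    corrupt: Callable[[], None],
    run: Callable[[Sequence[str]], object] = subprocess.check_call,
) -> None:
    """Stop the source server, apply the corruption, then start it again.

    `corrupt` is a caller-supplied callback that edits the on-disk files.
    `run` executes a command line; it is injectable so the ordering can be
    exercised without a real cluster (hypothetical convenience).
    """
    # Fast shutdown, waiting until the server has actually exited.
    run(["pg_ctl", "stop", "-D", data_dir, "-m", "fast", "-w"])
    try:
        corrupt()
    finally:
        # Restart even if the callback failed, so later steps find a server.
        run(["pg_ctl", "start", "-D", data_dir, "-w"])
```

A caller would pass a closure that scribbles on the chosen relation file, and take the base backup only after this function returns.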

I'd bet a good lunch that nondefault BLCKSZ would break it, as well,
since the way in which the corruption is induced is just guessing
as to where page boundaries are.
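
A sketch of what block-size-aware corruption could look like, again in Python and purely illustrative: the function names are hypothetical, and the block size is assumed to be supplied by the caller (for example from the output of SHOW block_size) rather than hard-coded as 8192.

```python
def page_offset(block_size: int, page_no: int, offset_in_page: int) -> int:
    """Return the absolute file offset of a byte inside a given page."""
    if block_size <= 0 or block_size & (block_size - 1):
        raise ValueError("block size must be a positive power of two")
    if page_no < 0:
        raise ValueError("page number must not be negative")
    if not 0 <= offset_in_page < block_size:
        raise ValueError("offset must fall inside a single page")
    return page_no * block_size + offset_in_page


def corrupt_page(path: str, block_size: int, page_no: int,
                 offset_in_page: int, length: int = 4) -> int:
    """Invert `length` bytes within one page; return the starting offset."""
    if offset_in_page + length > block_size:
        raise ValueError("corruption must not cross a page boundary")
    start = page_offset(block_size, page_no, offset_in_page)
    with open(path, "r+b") as f:
        f.seek(start)
        original = f.read(length)
        if len(original) != length:
            raise ValueError("file is shorter than the requested page")
        f.seek(start)
        # Flipping every bit guarantees the bytes differ from the original.
        f.write(bytes(b ^ 0xFF for b in original))
    return start
```

Because the offset is derived from the block size, the same call touches page 1 whether the build uses 8192-byte or 32768-byte blocks, instead of landing at an arbitrary spot in some other page.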

Also, scribbling on tables as sensitive as pg_class is just asking for
trouble IMO. I don't see anything in this test, for example, that
prevents autovacuum from running and causing a PANIC before the test
can complete. Even with AV off, there's a good chance that clobber-
cache-always animals will fall over because they do so many more
physical accesses to the system catalogs. I'd suggest inducing the
corruption in some user table(s) that we can more tightly constrain
the source server's accesses to.

regards, tom lane
